Random Restart Hill Climbing

Introduction

The random restart hill climbing works by starting a hill climbing search and jumping to a random new position after n_iter_restart iterations. Those restarts should prevent the algorithm getting stuck in local optima.

Example

from hyperactive.opt.gfo import RandomRestartHillClimbing
from hyperactive.experiment.integrations import SklearnCvExperiment
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_wine

# Load dataset
X, y = load_wine(return_X_y=True)

# Define search space
param_grid = {
    "max_depth": [3, 5, 7, 10, None],
    "min_samples_split": [2, 5, 10, 20],
    "min_samples_leaf": [1, 2, 4, 8]
}

# Create experiment
experiment = SklearnCvExperiment(
    estimator=DecisionTreeClassifier(random_state=42),
    param_grid=param_grid,
    X=X, y=y,
    cv=5
)

# Create optimizer with restart parameters
optimizer = RandomRestartHillClimbing(
    experiment=experiment,
    n_iter_restart=25
)

# Run optimization
best_params = optimizer.solve()
print("Best parameters:", best_params)

About the implementation

The random restart hill climbing inherits its behaviour from the regular hill climbing and expands it by jumping to a random position during the iteration step if the following criteria is meet:

\[ \text{iter}_i \mathrm{\%} \text{n_iter_restart} = 0 \]

Parameters

`epsilon`

The step-size of the hill climbing algorithm. Increasing epsilon also increases the average step-size, because its proportional to the standard-deviation of the distribution of the hill-climbing-based algorithm. If epsilon is too large the newly selected positions will be at the edge of the search space. If its value is very low it might not find new positions.

type: float
default: 0.03
typical range: 0.01 ... 0.3

`distribution`

The mathematical distribution the algorithm draws samples from. All available distributions are taken from the numpy-package.

type: string
default: "normal"
possible values: "normal", "laplace", "logistic", gumbel"

`n_neighbours`

The number of positions the algorithm explores from its current postion before setting its current position to the best of those neighbour positions. If the value of n_neighbours is large the hill-climbing-based algorithm will take a lot of time to choose the next position to move to, but the choice will probably be a good one. It might be a prudent approach to increase n_neighbours of the search-space has a lot of dimensions, because there are more possible directions to move to.

type: int
default: 3
typical range: 1 ... 10

`n_iter_restart`

The number of iterations the algorithm performs before jumping to a random position.

type: int
default: 10
typical range: 5 ... 20