Stochastic Hill Climbing

Introduction

Stochastic hill climbing extends the normal hill climbing by a simple method against getting stuck in local optima. It has a parameter p_accept you can set, that determines the probability to accept worse solutions as a next position. This should enable the stochastic hill climbing to find better solutions in a non-convex optimization problem over many iterations.

Example

from hyperactive import Hyperactive
from hyperactive.optimizers import StochasticHillClimbingOptimizer

...

optimizer = StochasticHillClimbingOptimizer(epsilon=0.3)

hyper = Hyperactive()
hyper.add_search(model, search_space, n_iter=50, optimizer=optimizer)
hyper.run()

About the implementation

The stochastic hill climbing inherits the behaviour of the regular hill climbing algorithm and adds its additional functionality after the evaluation of the score is done. If the new score is not better than the previous one the algorithm starts the following calculation:

\[ \text{score}_{normalized} = \text{norm} * \frac{\text{score}_{current} - \text{score}_{new}}{\text{score}_{current} + \text{score}_{new}} \]

\[ p = \exp^{-\text{score}_{normalized}} \]

If \(p\) is smaller than a random number between 0 and p_accept the new position gets accepted anyways.

Parameters

`epsilon`

The step-size of the hill climbing algorithm. Increasing epsilon also increases the average step-size, because its proportional to the standard-deviation of the distribution of the hill-climbing-based algorithm. If epsilon is too large the newly selected positions will be at the edge of the search space. If its value is very low it might not find new positions.

type: float
default: 0.03
typical range: 0.01 ... 0.3

`distribution`

The mathematical distribution the algorithm draws samples from. All available distributions are taken from the numpy-package.

type: string
default: "normal"
possible values: "normal", "laplace", "logistic", gumbel"

`n_neighbours`

The number of positions the algorithm explores from its current postion before setting its current position to the best of those neighbour positions. If the value of n_neighbours is large the hill-climbing-based algorithm will take a lot of time to choose the next position to move to, but the choice will probably be a good one. It might be a prudent approach to increase n_neighbours of the search-space has a lot of dimensions, because there are more possible directions to move to.

type: int
default: 3
typical range: 1 ... 10

`p_accept`

The probability factor used in the equation to calculate if a worse position is accepted as the new position.

type: float
default: 0.1
typical range: 0.01 ... 10