Lipschitz Optimization

Introduction

Lipschitz optimization (often called LIPO) is a global optimization algorithm that computes an upper bound of the objective function based on all previously explored positions in the search space.

Example

from hyperactive import Hyperactive
from hyperactive.optimizers import LipschitzOptimizer

...

optimizer = LipschitzOptimizer()

hyper = Hyperactive()
hyper.add_search(model, search_space, n_iter=50, optimizer=optimizer)
hyper.run()

About the implementation

Lipschitz optimization is very similar to sequence-model-based optimizers. The upper bound serves as an acquisition function and is calculated for every position in the search space with the following equation:

\[ U(x) = \min_{i=1...t}(f(x_i) + k \cdot || x-x_i ||_2) \]

where:

  • \(f(x_i)\) => is the score of a known position
  • \(k\) => is the maximum slope between known positions
  • \(|| x-x_i ||_2\) => is the distance between a known position \(x_i\) and a position in the search space \(x\)
  • \(\min_{i=1...t}\) => returns the minimum value of the calculations for the known positions \(i=1...t\)

The equation above results in an upper bound that looks as if the known positions in the search space were connected by pyramid-shaped lines, with the peaks located between the known positions.
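
To make the equation concrete, here is a minimal NumPy sketch of the upper bound. It is not part of the Hyperactive API, and the values of known_x, known_scores and the slope k are made up purely for illustration:

import numpy as np

def upper_bound(x, known_x, known_scores, k):
    # U(x) = min_i( f(x_i) + k * ||x - x_i||_2 )
    distances = np.linalg.norm(known_x - x, axis=1)
    return np.min(known_scores + k * distances)

# two known positions in a 2D search space (example values)
known_x = np.array([[1.0, 2.0], [4.0, 0.5]])
known_scores = np.array([0.3, 0.7])
k = 1.5  # assumed maximum slope between the known positions

print(upper_bound(np.array([2.0, 2.0]), known_x, known_scores, k))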

Parameters

max_sample_size

The max_sample_size is a first pass of random sampling that takes place before all possible positions are generated for the sequence-model-based optimization. It samples the search space directly and takes effect if the search space is very large:

import numpy as np

search_space = {
  "x1": np.arange(0, 1000, 0.01),
  "x2": np.arange(0, 1000, 0.01),
  "x3": np.arange(0, 1000, 0.01),
  "x4": np.arange(0, 1000, 0.01),
}

The max_sample_size-parameter is necessary to avoid a memory overload from generating all possible positions of the search-space. The search-space above corresponds to \(100000^4 = 10^{20}\) possible positions. This memory overload is expected for a sequence-model-based optimization algorithm, because the surrogate model has the job of making a prediction for every position in the search-space to calculate the acquisition-function. The max_sample_size-parameter was introduced to provide a better out-of-the-box experience when using SMB-optimizers.
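
As a quick back-of-the-envelope check of that number (plain Python, purely illustrative):

import numpy as np

dim_size = np.arange(0, 1000, 0.01).size  # 100000 values per dimension
n_positions = dim_size ** 4               # 100000**4 = 10**20 positions

# at 4 floats per position and 8 bytes per float this would take
# roughly 3.2 * 10**21 bytes, far beyond any available memory
print(n_positions)  # 100000000000000000000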

  • type: int
  • default: 10000000
  • typical range: 1000000 ... 100000000
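
For example, a smaller sample size can be passed to trade off coverage of the search-space against memory, based on the parameter described above:

from hyperactive.optimizers import LipschitzOptimizer

# sample at most one million positions directly from the search space
optimizer = LipschitzOptimizer(max_sample_size=1000000)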

sampling

The sampling-parameter is a second pass of random sampling. It samples from the list of all possible positions (not directly from the search-space). This might be necessary because the predict-method of the surrogate model could otherwise overload the memory.

  • type: dict
  • default: {'random': 1000000}
  • typical range: -
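
A usage sketch based on the default value shown above; the dict maps the sampling method to the number of positions the surrogate model is evaluated on:

from hyperactive.optimizers import LipschitzOptimizer

# evaluate the surrogate model on at most 100000 randomly sampled
# positions from the list of all possible positions
optimizer = LipschitzOptimizer(sampling={"random": 100000})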

warm_start_smbo

The warm_start_smbo-parameter accepts a pandas dataframe that contains the search-data (parameters and scores) from a previous optimization run. The dataframe containing the search-data could look like this:

x1    x2    score
5     15    0.3
10    12    0.7
...   ...   ...

Where the corresponding search-space would look like this:

search_space = {
  "x1": np.arange(0, 20),
  "x2": np.arange(0, 20),
}

Before passing the search-data to the optimizer, make sure that its columns match the search-space of the new optimization run. You can therefore not add another dimension ("x3") to the search-space and expect the warm-start to work: the dimensionality of the optimization must be preserved and fit the problem.

opt = LipschitzOptimizer(warm_start_smbo=search_data)

  • type: pandas dataframe, None
  • default: None
  • possible values: -
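
Putting it together, a minimal warm-start sketch for this optimizer could look like the following; the search-data values are the made-up ones from the table above:

import numpy as np
import pandas as pd
from hyperactive.optimizers import LipschitzOptimizer

search_space = {
  "x1": np.arange(0, 20),
  "x2": np.arange(0, 20),
}

# search-data from a previous run; the columns must match the
# search-space dimensions plus a "score" column
search_data = pd.DataFrame(
  {"x1": [5, 10], "x2": [15, 12], "score": [0.3, 0.7]}
)

optimizer = LipschitzOptimizer(warm_start_smbo=search_data)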