Forest Optimization

Introduction

The forest-optimizer calculates the expected improvement of the position space with a tree-based model. This optimization technique is very similar to bayesian-optimization in every part, except its surrogate model.

Example

from hyperactive import Hyperactive
from hyperactive.optimizers import ForestOptimizer

...

optimizer = ForestOptimizer(xi=0.15)

hyper = Hyperactive()
hyper.add_search(model, search_space, n_iter=50, optimizer=optimizer)
hyper.run()

About the implementation

The forest-optimizer shares most of its code base with the bayesian-optimizer. Only its model to calculate the expected score \(\mu\) and its standard deviation \(\sigma\) is different. Tree based models do not calculate the standard deviation by them self. A modification is necessary to determine the standard deviation from the impurity of the trees in the ensemble. This modification is taken from the paper "Algorithm Runtime Prediction: Methods & Evaluation" chapter 4.3.3.

Parameters

`xi`

Parameter for the expected uncertainty of the estimation. It is a parameter that belongs to the expected-improvement acquisition-function.

type: float
default: 0.3
typical range: 0.1 ... 0.9

`tree_regressor`

The tree_regressor-parameter determines the type of surrogate model the forest-optimization algorithm will use to calculate \(\mu\) and \(\sigma\).

type: string
default: extra_tree
possible values: "extra_tree", "random_forest", "gradient_boost"

`max_sample_size`

The max_sample_size is a first pass of randomly sampling, before all possible positions are generated for the sequence-model-based optimization. It samples the search space directly and takes effect if the search-space is very large:

search_data = {
  "x1": np.arange(0, 1000, 0.01),
  "x2": np.arange(0, 1000, 0.01),
  "x3": np.arange(0, 1000, 0.01),
  "x4": np.arange(0, 1000, 0.01),
}

The max_sample_size-parameter is necessary to avoid a memory overload from generating all possible positions from the search-space. The search-space above corresponds to a list of \(100000^4 = 100000000000000000000\) numpy arrays. This memory overload is expected for a sequence-model-based optimization algorithm, because the surrogate model has the job make a prediction for every position in the search-space to calculate the acquisition-function. The max_sample_size-parameter was introduced to provide a better out-of-the-box experience if using smb-optimizers.

type: int
default: 10000000
typical range: 1000000 ... 100000000

`sampling`

The sampling-parameter is a second pass of randomly sampling. It samples from the list of all possible positions (not directly from the search-space). This might be necessary, because the predict-method of the surrogate model could overload the memory.

type: dict
default: {'random': 1000000}
typical range: -

`warm_start_smbo`

The warm_start_smbo-parameter is a pandas dataframe that contains search-data with the results from a previous optimization run. The dataframe containing the search-data could look like this:

x1	x2	score
5	15	0.3
10	12	0.7
...	...	...
...	...	...

Where the corresponding search-space would look like this:

search_space = {
  "x1": np.arange(0, 20),
  "x2": np.arange(0, 20),
}

Before passing the search-data to the optimizer make sure, that the columns match the search-space of the new optimization run. So you could not add another dimension ("x3") to the search-space and expect the warm-start to work. The dimensionality of the optimization must be preserved and fit the problem.

opt = BayesianOptimization(warm_start_smbo=search_data)

type: pandas dataframe, None
default: None
possible values: -

`rand_rest_p`

Probability for the optimization algorithm to jump to a random position in an iteration step. It is set to 0 per default. The idea of this parameter is to give the possibility to inject randomness into algorithms that don't normally support it.

type: float
default: 0
typical range: 0.01 ... 0.1