Forest Optimization

Introduction

The forest-optimizer calculates the expected improvement across the position space with a tree-based model. This optimization technique is very similar to bayesian-optimization in every part except its surrogate model.

Example

import numpy as np
from gradient_free_optimizers import ForestOptimizer

# a minimal objective function and search space, so the example runs as-is
def objective_function(para):
    return -(para["x1"] ** 2)

search_space = {"x1": np.arange(-100, 101, 1)}

opt = ForestOptimizer(search_space, xi=0.15)
opt.search(objective_function, n_iter=100)

About the implementation

The forest-optimizer shares most of its code base with the bayesian-optimizer. Only the model used to calculate the expected score \(\mu\) and its standard deviation \(\sigma\) is different. Tree-based models do not provide a standard deviation by themselves, so a modification is necessary to determine the standard deviation from the impurity of the trees in the ensemble. This modification is taken from the paper "Algorithm Runtime Prediction: Methods & Evaluation", section 4.3.3.
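
A minimal sketch of this idea, assuming a scikit-learn ExtraTreesRegressor as the surrogate model; the library's actual implementation follows the impurity-based variant described in the paper:

import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

# toy data standing in for already-evaluated positions and their scores
X_train = np.random.uniform(-5, 5, size=(30, 2))
y_train = -(X_train ** 2).sum(axis=1)
X_cand = np.random.uniform(-5, 5, size=(100, 2))  # candidate positions

model = ExtraTreesRegressor(n_estimators=100)
model.fit(X_train, y_train)

# predictions of every tree in the ensemble, shape (n_trees, n_candidates)
tree_preds = np.array([tree.predict(X_cand) for tree in model.estimators_])

mu = tree_preds.mean(axis=0)    # expected score per candidate position
sigma = tree_preds.std(axis=0)  # uncertainty estimate per candidate position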

Parameters

xi

Parameter for the expected uncertainty of the estimation. It belongs to the expected-improvement acquisition-function: larger values of xi favor exploration over exploitation. A sketch of this acquisition-function follows the list below.

  • type: float
  • default: 0.3
  • typical range: 0.1 ... 0.9
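
A hedged sketch of the standard expected-improvement formula in which xi appears; the library's exact implementation may differ in detail:

import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, y_best, xi=0.3):
    # xi raises the improvement threshold: larger values -> more exploration
    # (assumes sigma > 0 for all candidate positions)
    imp = mu - y_best - xi
    z = imp / sigma
    return imp * norm.cdf(z) + sigma * norm.pdf(z)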

replacement

This parameter determines whether a position is put back into the list of candidate positions after it has been selected and evaluated by the sequential model.

  • type: bool
  • default: True
  • possible values: True, False

tree_regressor

The tree_regressor-parameter determines the type of surrogate model the forest-optimization algorithm uses to calculate \(\mu\) and \(\sigma\). A usage example follows the list below.

  • type: string
  • default: extra_tree
  • possible values: "extra_tree", "random_forest", "gradient_boost"
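
For example, to switch the surrogate model to a random forest (reusing the search_space and objective_function from the example above):

opt = ForestOptimizer(search_space, tree_regressor="random_forest")
opt.search(objective_function, n_iter=100)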

max_sample_size

The max_sample_size-parameter controls a first pass of random sampling, which happens before all possible positions are generated for the sequence-model-based optimization. It samples the search-space directly and takes effect if the search-space is very large:

search_space = {
  "x1": np.arange(0, 1000, 0.01),
  "x2": np.arange(0, 1000, 0.01),
  "x3": np.arange(0, 1000, 0.01),
  "x4": np.arange(0, 1000, 0.01),
}

The max_sample_size-parameter is necessary to avoid a memory overload from generating all possible positions in the search-space. Each dimension of the search-space above contains 100000 values, so it corresponds to a list of \(100000^4 = 10^{20}\) positions. This memory overload is expected for a sequence-model-based optimization algorithm, because the surrogate model has the job of making a prediction for every position in the search-space to calculate the acquisition-function. The max_sample_size-parameter was introduced to provide a better out-of-the-box experience when using smb-optimizers. A short sketch of this calculation follows the list below.

  • type: int
  • default: 10000000
  • typical range: 1000000 ... 100000000
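
The size of the position space for the search-space above can be checked directly; a short sketch, reusing the documented default value:

import numpy as np
from gradient_free_optimizers import ForestOptimizer

dim = np.arange(0, 1000, 0.01)    # 100000 values per dimension
search_space = {f"x{i}": dim for i in range(1, 5)}

# number of positions a full enumeration would have to generate
print(float(dim.size) ** len(search_space))  # 1e+20

# cap the first sampling pass at the documented default
opt = ForestOptimizer(search_space, max_sample_size=10000000)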

sampling

The sampling-parameter controls a second pass of random sampling. It samples from the list of all possible positions (not directly from the search-space). This might be necessary, because the predict-method of the surrogate model could otherwise overload the memory. A usage example follows the list below.

  • type: dict
  • default: {'random': 1000000}
  • typical range: -
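
A usage sketch that shrinks this second sampling pass; the "random" key is the same one used by the default value:

opt = ForestOptimizer(search_space, sampling={"random": 100000})
opt.search(objective_function, n_iter=100)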

warm_start_smbo

The warm_start_smbo-parameter is a pandas dataframe that contains search-data with the results from a previous optimization run. The dataframe containing the search-data could look like this:

x1    x2    score
5     15    0.3
10    12    0.7
...   ...   ...

Where the corresponding search-space would look like this:

search_space = {
  "x1": np.arange(0, 20),
  "x2": np.arange(0, 20),
}

Before passing the search-data to the optimizer, make sure that its columns match the search-space of the new optimization run. So you cannot add another dimension ("x3") to the search-space and expect the warm-start to work. The dimensionality of the optimization must be preserved and fit the problem. A two-run example follows the parameter list below.

opt = ForestOptimizer(search_space, warm_start_smbo=search_data)

  • type: pandas dataframe, None
  • default: None
  • possible values: -
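
A minimal two-run sketch, assuming the collected results are exposed as the search_data attribute after a run:

# first run collects search-data
opt0 = ForestOptimizer(search_space)
opt0.search(objective_function, n_iter=50)

# second run starts from those results over the same search-space
opt1 = ForestOptimizer(search_space, warm_start_smbo=opt0.search_data)
opt1.search(objective_function, n_iter=50)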

rand_rest_p

Probability for the optimization algorithm to jump to a random position in an iteration step. It is set to 0 by default. The idea of this parameter is to provide a way to inject randomness into algorithms that do not normally support it. A usage example follows the list below.

  • type: float
  • default: 0
  • typical range: 0.01 ... 0.1
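
For example, to let the optimizer jump to a random position in roughly 5% of iterations:

opt = ForestOptimizer(search_space, rand_rest_p=0.05)
opt.search(objective_function, n_iter=100)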