Hyperopt is a popular open-source hyperparameter tuning library. Hyperopt offers two tuning algorithms: Random Search and the Bayesian method Tree of Parzen Estimators (TPE), which offer improved compute efficiency compared to a brute force approach such as grid search.
Databricks Runtime 5.4 ML and above include Hyperopt, augmented with an implementation powered by Apache Spark. By using the
SparkTrials extension of
hyperopt.Trials, you can easily distribute a Hyperopt run without making other changes to your Hyperopt usage: when calling the
hyperopt.fmin() function, you pass in the SparkTrials instance as the trials argument.
SparkTrials can accelerate single-machine tuning by distributing trials to Spark workers.
MLflow is an open source platform for managing the end-to-end machine learning lifecycle. Databricks Runtime 5.4 ML and above support automated MLflow tracking for hyperparameter tuning with Hyperopt and
SparkTrials in Python. When automated MLflow tracking is enabled and you run
SparkTrials, hyperparameters and evaluation metrics are automatically logged in MLflow. Without automated MLflow tracking, you must make explicit API calls to log to MLflow. Automated MLflow tracking is enabled by default; to disable it, set the Spark configuration
spark.databricks.mlflow.trackHyperopt.enabled to false. You can still use
SparkTrials to distribute tuning even without automated MLflow tracking. Databricks Runtime 5.5 ML and above include MLflow, so you do not need to install it separately.
Databricks does not support logging to MLflow from workers, so you cannot add custom logging code in the objective function you pass to Hyperopt.
This section describes how to configure the arguments you pass to Hyperopt, best practices in using Hyperopt, and troubleshooting issues that may arise when using Hyperopt.
The fmin() documentation has detailed explanations for all the arguments. The most important ones are briefly described below:
fn: The objective function, called with a value generated from the hyperparameter space (space).
fn can return the loss as a scalar value or in a dictionary (refer to the Hyperopt docs for details). This is usually where most of your code would be, for example, loss calculation and model training.
space: An expression that generates the hyperparameter space Hyperopt searches. A simple example is
hp.uniform('x', -10, 10), which defines a single-dimension search space between -10 and 10. Hyperopt provides great flexibility in defining the hyperparameter space. After you are familiar with Hyperopt you can use this argument to make your tuning more efficient.
algo: The search algorithm Hyperopt uses to search the hyperparameter space (
space). Typical values are
hyperopt.rand.suggest for Random Search and hyperopt.tpe.suggest for TPE.
max_evals: The number of hyperparameter settings to try, that is, the number of models to fit. This number should be large enough to amortize overhead.
max_queue_len: The number of hyperparameter settings Hyperopt should generate ahead of time. Since the Hyperopt TPE generation algorithm can take some time, it can be helpful to increase this beyond the default value of 1, but generally no larger than the parallelism setting of SparkTrials.
parallelism: The maximum number of concurrent trials allowed. This value cannot be greater than 128 or the total number of CPUs across all worker nodes of the cluster. Higher concurrency usually shortens the wall-clock time needed to find the optimal configuration, but the total amount of compute (or DBUs) is typically more than for serial tuning: a serial run can always condition on the entire set of prior results, while with parallel runs the optimizer cannot know the outcome of the other concurrent runs still in progress when it selects new hyperparameter values to test.
timeout: The maximum number of seconds an
fmin() call can take. Once this number is exceeded, all runs are terminated and
fmin() exits. Information about completed runs is preserved, and the best model is selected from among them. This argument can save you time as well as help you control your cluster cost.
A detailed description of the SparkTrials API is included in the example notebooks; to find it, search for help(SparkTrials).
Here are a few things that help you get the most out of using Hyperopt:
- Bayesian approaches can be much more efficient than grid search and random search, so with the Hyperopt Tree of Parzen Estimators (TPE) algorithm it is often possible to explore more hyperparameters and larger ranges. That said, using domain knowledge to restrict the search domain can speed up tuning and produce better results.
- For models with long training times, start experimenting with small datasets and as many hyperparameters as possible. Use MLflow to introspect the best performing models, make informed decisions about how to fix as many hyperparameters as you can, and intelligently down-scope the parameter space as you prepare for tuning at scale.
- Take advantage of Hyperopt support for conditional dimensions and hyperparameters. For example, when you evaluate multiple flavors of gradient descent, instead of limiting the hyperparameter space to just the common hyperparameters, you can have Hyperopt include conditional hyperparameters—the ones that are only appropriate for a subset of the flavors.
SparkTrials logs tuning results as nested MLflow runs as follows:
- Main or parent run: The call to
fmin() is logged as the “main” run. If there is an active run,
SparkTrials logs under this active run and does not end the run when
fmin() returns. If there is no active run,
SparkTrials creates a new run, logs under it, and ends the run before fmin() returns.
- Child runs: Each hyperparameter setting tested (a “trial”) is logged as a child run under the main run.
When calling fmin(), we recommend active MLflow run management; that is, wrap the call to
fmin() inside a
with mlflow.start_run(): statement.
This ensures that each
fmin() call is logged under its own MLflow “main” run, and it makes it easier to log extra tags, params, or metrics to that run.
If fmin() is called multiple times within the same active MLflow run, MLflow logs those multiple
fmin() calls to that same “main” run. To resolve name conflicts for MLflow params and tags, names with conflicts are mangled by appending a UUID.
- If the loss is
NaN (not a number), it is usually because the objective function passed to fmin() returned NaN. A
NaN loss does not affect other runs and you can safely ignore it. If you want to avoid
NaN losses, you can either adjust the hyperparameter space or modify your objective function.
- With Hyperopt search methods the loss usually does not decrease monotonically with each run. However, you can often find the best hyperparameters more quickly than using other methods.
- Both Hyperopt and Spark incur certain overheads. For short trial runs (low tens of seconds), these overheads dominate and the speedup may be small or even zero.
- When you use
hp.choice, Hyperopt returns only the index of the chosen item, so the parameter logged in MLflow is also the index. You can use
hyperopt.space_eval to retrieve the original parameter values.
The examples in this section demonstrate how to do hyperparameter tuning with Hyperopt.
Here is a notebook that shows distributed Hyperopt + automated MLflow tracking in action.
After you perform the actions in the last cell in the notebook, your MLflow UI should display:
Here is a notebook that demonstrates how to tune the hyperparameters for multiple models and arrive at a best model overall. We use Hyperopt with
SparkTrials to select between two model types: Naive Bayes and Support Vector Machines (SVM). For each model type, Hyperopt can search over a different set of hyperparameters.