Hyperparameter tuning and AutoML

AutoML (automated machine learning) refers to automating the process of developing machine learning models. Databricks Runtime for Machine Learning incorporates MLflow and Hyperopt, two open source tools that automate the process of model selection and hyperparameter tuning.

Automated MLflow tracking

MLflow is an open source platform for managing the end-to-end machine learning lifecycle. For AutoML, MLflow provides automated tracking for model tuning with Apache Spark MLlib. With automated MLflow tracking, when you run tuning code using CrossValidator or TrainValidationSplit, the specified hyperparameters and evaluation metrics are automatically logged, making it easy to identify the optimal model. Automated MLflow tracking is available for Python notebooks only.
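For example, a tuning run like the following minimal sketch is picked up by automated tracking. The DataFrame name training_df and its "features" and "label" columns are assumed here for illustration; they are not part of the original text.

```python
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

lr = LogisticRegression(featuresCol="features", labelCol="label")

# Hyperparameters to sweep; each combination is evaluated during cross-validation.
param_grid = (ParamGridBuilder()
              .addGrid(lr.regParam, [0.01, 0.1, 1.0])
              .addGrid(lr.elasticNetParam, [0.0, 0.5, 1.0])
              .build())

cv = CrossValidator(estimator=lr,
                    estimatorParamMaps=param_grid,
                    evaluator=BinaryClassificationEvaluator(labelCol="label"),
                    numFolds=3)

# On Databricks Runtime ML, fitting the CrossValidator logs the tried
# hyperparameters and evaluation metrics to MLflow automatically.
cv_model = cv.fit(training_df)  # training_df is an assumed Spark DataFrame
```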

Hyperparameter tuning with Hyperopt

Hyperopt is an open source library for distributed hyperparameter tuning and model selection. By using conditional hyperparameters, you can also automatically search across different model architectures, as in the following sketch.
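This is a hedged sketch of a conditional search space that chooses between two model families; the model families and parameter names are illustrative, not prescribed by the original text.

```python
from hyperopt import hp

# hp.choice selects one branch per trial; the branch-specific hyperparameters
# are only sampled when that branch is chosen.
search_space = hp.choice("model_type", [
    {
        "type": "logistic_regression",
        # Log-uniform search over the regularization strength.
        "C": hp.loguniform("lr_C", -4, 2),
    },
    {
        "type": "random_forest",
        # These hyperparameters exist only in the random forest branch.
        "max_depth": hp.quniform("rf_max_depth", 2, 10, 1),
        "n_estimators": hp.quniform("rf_n_estimators", 50, 500, 50),
    },
])
```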

Hyperopt optimizes a scalar-valued objective function over a set of input parameters. When you use Hyperopt for hyperparameter tuning, you define an objective function that takes the hyperparameters of interest as input and returns a training or validation loss. Inside the objective function, you load the training data, train your model with the hyperparameters it receives, and save model checkpoints periodically as usual. Hyperopt provides two tuning algorithms, Random Search and the Bayesian method Tree of Parzen Estimators (TPE), both of which are more compute-efficient than a brute-force approach such as grid search.
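The following is a minimal sketch of this workflow, assuming scikit-learn and a small in-memory dataset; the dataset, model, and search range are illustrative assumptions.

```python
from hyperopt import fmin, tpe, hp, STATUS_OK
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Illustrative dataset; in practice you load your own training data here.
X, y = load_breast_cancer(return_X_y=True)

def objective(params):
    # Train with the hyperparameters proposed by Hyperopt and return a loss
    # to minimize (here, the negative cross-validated accuracy).
    model = LogisticRegression(C=params["C"], max_iter=1000)
    accuracy = cross_val_score(model, X, y, cv=3).mean()
    return {"loss": -accuracy, "status": STATUS_OK}

best_params = fmin(
    fn=objective,
    space={"C": hp.loguniform("C", -4, 2)},
    algo=tpe.suggest,   # TPE; use hyperopt.rand.suggest for Random Search
    max_evals=20,
)
```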

There are two ways to use Hyperopt in a distributed setting:

  • Use distributed Hyperopt with single-machine training algorithms. Specifically, you use the SparkTrials class when calling hyperopt.fmin() and run single-machine training algorithms in the objective function, as shown in the sketch after this list.
  • Use single-machine Hyperopt with distributed training algorithms. Specifically, you use the default base.Trials class when calling hyperopt.fmin() and run distributed training algorithms in the objective function.
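The following is a hedged sketch of the first approach, reusing the kind of single-machine objective function defined earlier; the parallelism value and search range are illustrative.

```python
from hyperopt import SparkTrials, fmin, tpe, hp

# Number of trials evaluated concurrently on the cluster (illustrative value).
spark_trials = SparkTrials(parallelism=4)

best_params = fmin(
    fn=objective,        # single-machine training code, e.g. scikit-learn
    space={"C": hp.loguniform("C", -4, 2)},
    algo=tpe.suggest,
    max_evals=32,
    trials=spark_trials, # distributes trial evaluation across Spark workers
)
```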

See the following articles for detailed demonstrations of these two use cases:

The following end-to-end example notebook uses Hyperopt and SparkTrials to run a hyperparameter sweep that trains multiple models in parallel. It also uses MLflow to track the performance of each hyperparameter configuration.