Track machine learning training runs
The MLflow tracking component lets you log source properties, parameters, metrics, tags, and artifacts related to training a machine learning model. To get started with MLflow, try one of the MLflow quickstart tutorials.
MLflow tracking with experiments and runs
MLflow tracking is based on two concepts, experiments and runs:
An MLflow experiment is the primary unit of organization and access control for MLflow runs; all MLflow runs belong to an experiment. Experiments let you visualize, search for, and compare runs, as well as download run artifacts and metadata for analysis in other tools.
An MLflow run corresponds to a single execution of model code. Each run records the following information:
Source: Name of the notebook that launched the run or the project name and entry point for the run.
Version: Notebook revision if run from a notebook in a Databricks workspace, or Git commit hash if run from Databricks Repos or from an MLflow Project.
Start & end time: Start and end time of the run.
Parameters: Model parameters saved as key-value pairs. Both keys and values are strings.
Metrics: Model evaluation metrics saved as key-value pairs. The value is numeric. Each metric can be updated throughout the course of the run (for example, to track how your model’s loss function is converging), and MLflow records and lets you visualize the metric’s history.
Tags: Run metadata saved as key-value pairs. You can update tags during and after a run completes. Both keys and values are strings.
Artifacts: Output files in any format. For example, you can record images, models (for example, a pickled scikit-learn model), and data files (for example, a Parquet file) as an artifact.
The MLflow Tracking API logs parameters, metrics, tags, and artifacts from a model run. The Tracking API communicates with an MLflow tracking server. When you use Databricks, a Databricks-hosted tracking server logs the data. The hosted MLflow tracking server has Python, Java, and R APIs.
To learn how to control access to experiments, see MLflow Experiment permissions and Change permissions for experiment.
MLflow is installed on Databricks Runtime ML clusters. To use MLflow on a Databricks Runtime cluster, you must install the mlflow library. For instructions on installing a library onto a cluster, see Install a library on a cluster. The specific packages to install for MLflow are:
For Python, select Library Source PyPI and enter mlflow in the Package field.
For R, select Library Source CRAN and enter mlflow in the Package field.
For Scala, install these two packages:
Select Library Source Maven and enter org.mlflow:mlflow-client:1.11.0 in the Coordinates field.
Select Library Source PyPI and enter mlflow in the Package field.
Where MLflow runs are logged
All MLflow runs are logged to the active experiment, which you can set in any of the following ways:
Use the mlflow.set_experiment() command.
Specify the experiment_id parameter in the mlflow.start_run() command.
Set one of the MLflow environment variables MLFLOW_EXPERIMENT_NAME or MLFLOW_EXPERIMENT_ID.
If no active experiment is set, runs are logged to the notebook experiment.
To log your experiment results to a remotely hosted MLflow Tracking server in a workspace other than the one in which you are running your experiment, set the tracking URI to reference the remote workspace with mlflow.set_tracking_uri(), and set the path to your experiment in the remote workspace by using mlflow.set_experiment():
mlflow.set_tracking_uri(<uri_of_remote_workspace>)
mlflow.set_experiment("path to experiment in remote workspace")
Logging example notebook
This notebook shows how to log runs to a notebook experiment and to a workspace experiment. Only MLflow runs initiated within a notebook can be logged to the notebook experiment. MLflow runs launched from any notebook or from the APIs can be logged to a workspace experiment. For information about viewing logged runs, see View notebook experiment and View workspace experiment.
You can use MLflow Python, Java or Scala, and R APIs to start runs and record run data. For details, see the MLflow quickstart notebooks.
Organize training runs with MLflow experiments
Experiments are units of organization for your model training runs. There are two types of experiments: workspace and notebook.
You can create a workspace experiment from the Databricks Machine Learning UI or the MLflow API. Workspace experiments are not associated with any notebook, and any notebook can log a run to these experiments by using the experiment ID or the experiment name.
A notebook experiment is associated with a specific notebook. Databricks automatically creates a notebook experiment if there is no active experiment when you start a run using mlflow.start_run().
To see all of the experiments in a workspace that you have access to, click Experiments in the sidebar. This icon appears only when you are in the machine learning persona.
To search for experiments, type text in the Search field and click Search. The experiment list changes to show only those experiments that contain the search text in the Name, Location, Created by, or Notes column.
Click the name of any experiment in the table to display its experiment page:
The experiment page lists all runs associated with the experiment. From the table, you can open the run page for any run associated with the experiment by clicking its Run Name. The Source column gives you access to the notebook version that created the run. You can also search and filter runs by metrics or parameter settings.
Create workspace experiment
This section describes how to create a workspace experiment using the Databricks UI. You can create a workspace experiment directly from the workspace or from the Experiments page.
You can also use the MLflow API, or the Databricks Terraform provider with databricks_mlflow_experiment.
For instructions on logging runs to workspace experiments, see Logging example notebook.
Click Workspace in the sidebar.
Go to the folder in which you want to create the experiment.
Do one of the following:
Next to any folder, click on the right side of the text and select Create > MLflow Experiment.
In the workspace or a user folder, click and select Create > MLflow Experiment.
In the Create MLflow Experiment dialog, enter a name for the experiment and an optional artifact location. If you do not specify an artifact location, artifacts are stored in
Databricks supports DBFS, S3, and Azure Blob storage artifact locations.
To store artifacts in S3, specify a URI of the form s3://<bucket>/<path>. MLflow obtains credentials to access S3 from your cluster’s instance profile. Artifacts stored in S3 do not appear in the MLflow UI; you must download them using an object storage client.
The maximum size for an MLflow artifact uploaded to DBFS on AWS is 5GB.
To store artifacts in Azure Blob storage, specify a URI of the form wasbs://<container>@<storage-account>.blob.core.windows.net/<path>. Artifacts stored in Azure Blob storage do not appear in the MLflow UI; you must download them using a blob storage client.
When you store an artifact in a location other than DBFS, the artifact does not appear in the MLflow UI. Models stored in locations other than DBFS cannot be registered in Model Registry.
Click Create. An empty experiment appears.
You can also create a new workspace experiment from the Experiments page. To create a new experiment, use the drop-down menu. From the drop-down menu, you can select either an AutoML experiment or a blank (empty) experiment.
AutoML experiment. The Configure AutoML experiment page appears. For information about using AutoML, see Train ML models with the Databricks AutoML UI.
Blank experiment. The Create MLflow Experiment dialog appears. Enter a name and optional artifact location in the dialog to create a new workspace experiment. The default artifact location is
To log runs to this experiment, call mlflow.set_experiment() with the experiment path. The experiment path appears at the top of the experiment page. See Logging example notebook for details and an example notebook.
Create notebook experiment
When you use the mlflow.start_run() command in a notebook, the run logs metrics and parameters to the active experiment. If no experiment is active, Databricks creates a notebook experiment. A notebook experiment shares the same name and ID as its corresponding notebook. The notebook ID is the numerical identifier at the end of the notebook URL; see Notebook URL and ID.
For instructions on logging runs to notebook experiments, see Logging example notebook.
If you delete a notebook experiment using the API (for example, mlflow.tracking.MlflowClient.delete_experiment() in Python), the notebook itself is moved into the Trash folder.
Each experiment that you have access to appears on the experiments page. From this page, you can view any experiment. Click on an experiment name to display the experiment page.
Additional ways to access the experiment page:
You can access the experiment page for a workspace experiment from the workspace menu.
You can access the experiment page for a notebook experiment from the notebook.
View workspace experiment
Click Workspace in the sidebar.
Go to the folder containing the experiment.
Click the experiment name.
View notebook experiment
In the notebook’s right sidebar, click the Experiment icon.
The Experiment Runs sidebar appears and shows a summary of each run associated with the notebook experiment, including run parameters and metrics. At the top of the sidebar is the name of the experiment that the notebook most recently logged runs to (either a notebook experiment or a workspace experiment).
From the sidebar, you can navigate to the experiment page or directly to a run.
To view the experiment, click at the far right, next to Experiment Runs.
To display a run, click the name of the run.
You can rename, delete, or manage permissions for an experiment you own from the experiments page, the experiment page, or the workspace menu.
Rename experiment from the experiments page or the experiment page
This feature is in Public Preview.
To rename an experiment from the experiments page or the experiment page, click and select Rename.
Copy experiment name
To copy the experiment name, click at the top of the experiment page. You can use this name in the MLflow command set_experiment to set the active MLflow experiment.
You can also copy the experiment name from the experiment sidebar in a notebook.
Delete notebook experiment
Notebook experiments are part of the notebook and cannot be deleted separately. When you delete a notebook, the associated notebook experiment is deleted. When you delete a notebook experiment using the UI, the notebook is also deleted.
To delete notebook experiments using the API, use the Workspace API 2.0 to ensure both the notebook and experiment are deleted from the workspace.
Delete workspace or notebook experiment from the experiments page or the experiment page
This feature is in Public Preview.
To delete an experiment from the experiments page or the experiment page, click and select Delete.
When you delete a notebook experiment, the notebook is also deleted.
Change permissions for experiment
To change permissions for an experiment from the experiment page, click Share.
You can change permissions for an experiment that you own from the experiments page. Click in the Actions column and select Permission.
For more information about experiment permissions, see MLflow Experiment permissions.
Copy experiments between workspaces
To migrate MLflow experiments between workspaces, you can use the community-driven open source project MLflow Export-Import.
With these tools, you can:
Share and collaborate with other data scientists in the same or another tracking server. For example, you can clone an experiment from another user into your workspace.
Copy MLflow experiments and runs from your local tracking server to your Databricks workspace.
Back up mission critical experiments and models to another Databricks workspace.
Manage training code with MLflow runs
All MLflow runs are logged to the active experiment. If you have not explicitly set an experiment as the active experiment, runs are logged to the notebook experiment.
You can access a run either from its parent experiment page or directly from the notebook that created the run.
From the experiment page, in the runs table, click the start time of a run.
From the notebook, click next to the date and time of the run in the Experiment Runs sidebar.
The run screen shows the parameters used for the run, the metrics resulting from the run, and any tags or notes. To display Notes, Parameters, Metrics, or Tags for this run, click to the left of the label.
You can also access artifacts saved from a run on this screen.
Code snippets for prediction
If you log a model from a run, the model appears in the Artifacts section of this page. To display code snippets illustrating how to load and use the model to make predictions on Spark and pandas DataFrames, click the model name.
View the notebook or Git project used for a run
To view the version of the notebook that created a run:
On the experiment page, click the link in the Source column.
On the run page, click the link next to Source.
From the notebook, in the Experiment Runs sidebar, click the Notebook icon in the box for that Experiment Run.
The version of the notebook associated with the run appears in the main window with a highlight bar showing the date and time of the run.
If the run was launched remotely from a Git project, click the link in the Git Commit field to open the specific version of the project used in the run. The link in the Source field opens the main branch of the Git project used in the run.
Add a tag to a run
Tags are key-value pairs that you can create and use later to search for runs.
From the run page, click if it is not already open. The tags table appears.
Click in the Name and Value fields and type the key and value for your tag.
Reproduce the software environment of a run
You can reproduce the exact software environment for the run by clicking Reproduce Run. The following dialog appears:
With the default settings, when you click Confirm:
The notebook is cloned to the location shown in the dialog.
If the original cluster still exists, the cloned notebook is attached to the original cluster and the cluster is started.
If the original cluster no longer exists, a new cluster with the same configuration, including any installed libraries, is created and started. The notebook is attached to the new cluster.
You can select a different location for the cloned notebook and inspect the cluster configuration and installed libraries:
To select a different folder to save the cloned notebook, click Edit Folder.
To see the cluster spec, click View Spec. To clone only the notebook and not the cluster, uncheck this option.
To see the libraries installed on the original cluster, click View Libraries. If you don’t care about installing the same libraries as on the original cluster, uncheck this option.
You can search for runs based on parameter or metric values. You can also search for runs by tag.
To search for runs that match an expression containing parameter and metric values, enter a query in the search field and click Search. Some query syntax examples are:
metrics.r2 > 0.3
params.elasticNetParam = 0.5
params.elasticNetParam = 0.5 AND metrics.avg_areaUnderROC > 0.3
To search for runs by tag, enter tags in the format tags.<key>="<value>". String values must be enclosed in quotes as shown.
tags.color="blue" AND tags.size="5"
Both keys and values can contain spaces. If the key includes spaces, you must enclose it in backticks as shown.
tags.`my custom tag` = "my value"
You can also filter runs based on their state (Active or Deleted) and based on whether a model version is associated with the run. To do this, make your selections from the State and Time Created drop-down menus respectively.
You can compare runs from a single experiment or from multiple experiments. The Comparing Runs page presents information about the selected runs in graphic and tabular formats.
Compare runs from a single experiment
On the experiment page, select two or more runs by clicking in the checkbox to the left of the run, or select all runs by checking the box at the top of the column.
Click Compare. The Comparing `<N>` Runs screen appears.
Compare runs from multiple experiments
On the experiments page, select the experiments you want to compare by clicking in the box at the left of the experiment name.
Click Compare (n) (n is the number of experiments you selected). A screen appears showing all of the runs from the experiments you selected.
Select two or more runs by clicking in the checkbox to the left of the run, or select all runs by checking the box at the top of the column.
Click Compare. The Comparing `<N>` Runs screen appears.
Use the Comparing Runs page
The Comparing Runs page shows visualizations of run results and tables of run information, run parameters, and metrics.
To create a visualization:
Select the plot type (Parallel Coordinates Plot, Scatter Plot, or Contour Plot).
For a Parallel Coordinates Plot, select the parameters and metrics to plot. From here, you can identify relationships between the selected parameters and metrics, which helps you better define the hyperparameter tuning space for your models.
For a Scatter Plot or Contour Plot, select the parameter or metric to display on each axis.
The Parameters and Metrics tables display the run parameters and metrics from all selected runs. The columns in these tables are identified by the Run details table immediately above. For simplicity, you can hide parameters and metrics that are identical in all selected runs by toggling .
Select one or more runs.
Click Download CSV. A CSV file containing the following fields downloads:
Run ID,Name,Source Type,Source Name,User,Status,<parameter1>,<parameter2>,...,<metric1>,<metric2>,...
In the experiment, select one or more runs by clicking in the checkbox to the left of the run.
If the run is a parent run, decide whether you also want to delete descendant runs. This option is selected by default.
Click Delete to confirm or Cancel to cancel. Deleted runs are saved for 30 days. To display deleted runs, select Deleted in the State field.
Copy runs between workspaces
To import or export MLflow runs to or from your Databricks workspace, you can use the community-driven open source project MLflow Export-Import.
Access the MLflow tracking server from outside Databricks
You can also write to and read from the tracking server from outside Databricks, for example using the MLflow CLI.
Analyze MLflow runs programmatically
You can access MLflow run data programmatically using the following two DataFrame APIs:
The MLflow Python client search_runs API returns a pandas DataFrame.
The MLflow experiment data source returns an Apache Spark DataFrame.
This example demonstrates how to use the MLflow Python client to build a dashboard that visualizes changes in evaluation metrics over time, tracks the number of runs started by a specific user, and measures the total number of runs across all users:
Why model training metrics and outputs may vary
Many of the algorithms used in ML have a random element, such as sampling or random initial conditions within the algorithm itself. When you train a model using one of these algorithms, the results might not be the same with each run, even if you start the run with the same conditions. Many libraries offer a seeding mechanism to fix the initial conditions for these stochastic elements. However, there may be other sources of variation that are not controlled by seeds. Some algorithms are sensitive to the order of the data, and distributed ML algorithms may also be affected by how the data is partitioned. Generally this variation is not significant and not important in the model development process.
To control variation caused by differences in ordering and partitioning, use the PySpark functions repartition and sortWithinPartitions.
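The two effects described above, seeding and sensitivity to data order, can be illustrated with a toy sketch in plain Python rather than PySpark. The functions and data here are hypothetical stand-ins for a real training routine; sorting the values before accumulating plays the role that a fixed ordering plays in a distributed job.

```python
import random


def toy_fit(data, seed):
    """Stand-in for stochastic training: average a random subsample."""
    rng = random.Random(seed)  # a seeded generator fixes which items are sampled
    sample = rng.sample(data, k=10)
    return sum(sample) / len(sample)


def running_total(values):
    """Left-to-right float accumulation, so rounding depends on the order."""
    total = 0.0
    for value in values:
        total += value
    return total


data = list(range(1000))
first = toy_fit(data, seed=42)
second = toy_fit(data, seed=42)  # same seed, same subsample, same result

# The same three numbers in two different orders accumulate differently.
order_a = [1e16, 1.0, -1e16]
order_b = [1e16, -1e16, 1.0]
total_a = running_total(order_a)
total_b = running_total(order_b)

# Imposing one fixed order before accumulating removes the discrepancy.
stable_a = running_total(sorted(order_a))
stable_b = running_total(sorted(order_b))
print(first == second, total_a, total_b, stable_a == stable_b)
```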
MLflow tracking examples
The following notebooks demonstrate how to train several types of models and track the training data in MLflow and how to store tracking data in Delta Lake.