Track machine learning training runs

The MLflow tracking component lets you log source properties, parameters, metrics, tags, and artifacts related to training a machine learning model. To get started with MLflow, try one of the MLflow quickstart tutorials.

MLflow tracking with experiments and runs

MLflow tracking is based on two concepts, experiments and runs:

  • An MLflow experiment is the primary unit of organization and access control for MLflow runs; all MLflow runs belong to an experiment. Experiments let you visualize, search for, and compare runs, as well as download run artifacts and metadata for analysis in other tools.

  • An MLflow run corresponds to a single execution of model code.

The MLflow Tracking API logs parameters, metrics, tags, and artifacts from a model run. The Tracking API communicates with an MLflow tracking server. When you use Databricks, a Databricks-hosted tracking server logs the data. The hosted MLflow tracking server has Python, Java, and R APIs.

To learn how to control access to experiments, see MLflow Experiment permissions and Change permissions for experiment.

Note

MLflow is preinstalled on Databricks Runtime ML clusters. To use MLflow on a standard (non-ML) Databricks Runtime cluster, you must install the mlflow library yourself. For instructions on installing a library onto a cluster, see Install a library on a cluster. The specific packages to install for MLflow are:

  • For Python, select Library Source PyPI and enter mlflow in the Package field.

  • For R, select Library Source CRAN and enter mlflow in the Package field.

  • For Scala, install these two packages:

    • Select Library Source Maven and enter org.mlflow:mlflow-client:1.11.0 in the Coordinates field.

    • Select Library Source PyPI and enter mlflow in the Package field.

Where MLflow runs are logged

All MLflow runs are logged to the active experiment, which you can set, for example, with mlflow.set_experiment().

If no active experiment is set, runs are logged to the notebook experiment.

To log experiment results to a remotely hosted MLflow tracking server in a workspace other than the one where you run your experiment, set the tracking URI to reference the remote workspace with mlflow.set_tracking_uri(), then set the path to your experiment in the remote workspace with mlflow.set_experiment():

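import mlflow
# Placeholder values: replace both strings with the remote workspace URI and experiment path.
mlflow.set_tracking_uri("<uri-of-remote-workspace>")
mlflow.set_experiment("path to experiment in remote workspace")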

Logging example notebook

This notebook shows how to log runs to a notebook experiment and to a workspace experiment. Only MLflow runs initiated within a notebook can be logged to the notebook experiment. MLflow runs launched from any notebook or from the APIs can be logged to a workspace experiment. For information about viewing logged runs, see View notebook experiment and View workspace experiment.

Log MLflow runs notebook

You can use the MLflow Python, Java or Scala, and R APIs to start runs and record run data. For details, see the MLflow quickstart notebooks.

Access the MLflow tracking server from outside Databricks

You can also write to and read from the tracking server from outside Databricks, for example using the MLflow CLI.

Analyze MLflow runs programmatically

You can access MLflow run data programmatically using the following two DataFrame APIs:

This example demonstrates how to use the MLflow Python client to build a dashboard that visualizes changes in evaluation metrics over time, tracks the number of runs started by a specific user, and measures the total number of runs across all users:

Why model training metrics and outputs may vary

Many of the algorithms used in ML have a random element, such as sampling or random initial conditions within the algorithm itself. When you train a model using one of these algorithms, the results might differ from run to run, even if you start each run with the same conditions. Many libraries offer a seeding mechanism to fix the initial conditions for these stochastic elements. However, there may be other sources of variation that seeds do not control. Some algorithms are sensitive to the order of the data, and distributed ML algorithms may also be affected by how the data is partitioned. Generally, this variation is small and does not matter in the model development process.
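
The effect of seeding can be illustrated with the Python standard library alone. In this sketch, the shuffle_rows helper is hypothetical and simply stands in for any random sampling step inside a training routine.

```python
import random


def shuffle_rows(n_rows, seed=None):
    """Return row indices in random order (hypothetical stand-in for sampling)."""
    rng = random.Random(seed)  # seed=None draws the seed from the OS or clock
    indices = list(range(n_rows))
    rng.shuffle(indices)
    return indices


# Two calls with the same seed reproduce the same order.
seeded_a = shuffle_rows(10, seed=42)
seeded_b = shuffle_rows(10, seed=42)

# An unseeded call is free to produce a different order each time.
unseeded = shuffle_rows(10)

print(seeded_a == seeded_b)
print(unseeded)
```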

To control variation caused by differences in ordering and partitioning, use the PySpark functions repartition and sortWithinPartitions.

MLflow tracking examples

The following notebooks demonstrate how to train several types of models and track the training data in MLflow and how to store tracking data in Delta Lake.