MLflow 3.0 deep learning workflow (Beta)
This feature is in Beta.
Example notebook
The example notebook runs a single deep learning training job with PyTorch, tracked as one MLflow run. It logs a model checkpoint every 10 epochs, and each checkpoint is tracked as an MLflow LoggedModel. Using the MLflow UI or search API, you can inspect the checkpoint models and rank them by accuracy.
The notebook installs the scikit-learn and torch libraries.
MLflow 3.0 deep learning model with checkpoints notebook
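The core pattern in the notebook is to log a checkpoint as its own LoggedModel every 10 epochs and attach the accuracy metric to that checkpoint through its model_id. The following is a minimal sketch of that pattern, not the notebook's exact code: the stand-in network, the placeholder accuracy value, and the checkpoint name format are illustrative assumptions, and it assumes the MLflow 3 forms mlflow.pytorch.log_model(name=...) and mlflow.log_metric(model_id=...).
import mlflow
import mlflow.pytorch
import torch

# Stand-in network; the notebook trains its own PyTorch model.
model = torch.nn.Linear(4, 3)

with mlflow.start_run():
    for epoch in range(100):
        # ... training step for this epoch goes here ...

        if epoch % 10 == 0:
            # Log this checkpoint as its own MLflow LoggedModel.
            model_info = mlflow.pytorch.log_model(
                pytorch_model=model,
                name=f"checkpoint-epoch-{epoch}",  # illustrative name format
            )
            # Attach the accuracy metric to this checkpoint via model_id.
            accuracy = 0.0  # placeholder; the notebook computes accuracy on its training data
            mlflow.log_metric(
                key="accuracy",
                value=accuracy,
                step=epoch,
                model_id=model_info.model_id,
            )
Because each checkpoint carries its own model_id and metrics, the checkpoints can later be compared and ranked independently of the run that produced them.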
Use the UI to register a model
After running the notebook, you can view the saved checkpoint models in the MLflow experiments UI. A link to the experiment appears in the notebook cell output. Alternatively, follow these steps:
- Click Experiments in the workspace sidebar.
- Find your experiment in the experiments list. You can select the Only my experiments checkbox or use the Filter experiments search box to filter the list of experiments.
- Click the name of your experiment. The Runs page opens. The experiment contains one MLflow run.
- Click the Models tab. The individual checkpoint models are tracked on this screen. For each checkpoint, you can see the model's accuracy, along with all of its parameters and metadata.
In the example notebook, you registered the best performing model to Unity Catalog using the API (a minimal sketch of that route appears after the steps below). You can also register a model from the UI. To do so, follow these steps:
- From the Models tab, click the name of the model to register.
- On the model details page, in the upper-right corner, click Register model.
- Select Unity Catalog and either select an existing model name from the drop-down menu or enter a new name.
- Click Register.
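To register a checkpoint programmatically instead, you can point mlflow.register_model at the logged model. The snippet below is a minimal sketch, assuming the MLflow 3 models:/<model_id> URI form for a LoggedModel; the model_id value and the three-level Unity Catalog name main.default.dl_checkpoint_model are placeholders, not values from the example notebook.
import mlflow

# Register models in Unity Catalog rather than the workspace model registry.
mlflow.set_registry_uri("databricks-uc")

# Replace with the model_id of the checkpoint you want to register.
model_id = "m-bba8fa52b6a6499281c43ef17fcdac84"

registered = mlflow.register_model(
    model_uri=f"models:/{model_id}",
    name="main.default.dl_checkpoint_model",  # catalog.schema.model_name (placeholder)
)
print(registered.version)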
Use the API to rank checkpoint models
The following code shows how to rank the checkpoint models by accuracy.
import mlflow

# Fetch the checkpoint models logged to the current experiment.
ranked_checkpoints = mlflow.search_logged_models(output_format="list")

# Sort the checkpoints by their accuracy metric, highest first.
ranked_checkpoints.sort(
    key=lambda model: next((metric.value for metric in model.metrics if metric.key == "accuracy"), float("-inf")),
    reverse=True,
)

best_checkpoint: mlflow.entities.LoggedModel = ranked_checkpoints[0]
print(best_checkpoint.metrics[0])
<Metric:
dataset_digest='9951783d',
dataset_name='train',
key='accuracy',
model_id='m-bba8fa52b6a6499281c43ef17fcdac84',
run_id='394928abe6fc4787aaf4e666ac89dc8a',
step=90,
timestamp=1730828771880,
value=0.9553571428571429
>
worst_checkpoint: mlflow.entities.LoggedModel = ranked_checkpoints[-1]
print(worst_checkpoint.metrics[0])
<Metric:
dataset_digest='9951783d',
dataset_name='train',
key='accuracy',
model_id='m-88885bc26de7492f908069cfe15a1499',
run_id='394928abe6fc4787aaf4e666ac89dc8a',
step=0,
timestamp=1730828730040,
value=0.35714285714285715
>
What's the difference between the Models tab on the MLflow experiment page and the model version page in Catalog Explorer?
The Models tab of the experiment page and the model version page in Catalog Explorer show similar information about the model. The two views have different roles in the model development and deployment lifecycle.
- The Models tab of the experiment page presents the results of logged models from an experiment on a single page. The Charts tab on this page provides visualizations to help you compare models and select the model versions to register to Unity Catalog for possible deployment.
- In Catalog Explorer, the model version page provides an overview of all model performance and evaluation results. This page shows model parameters, metrics, and traces across all linked environments including different workspaces, endpoints, and experiments. This is useful for monitoring and deployment, and works especially well with deployment jobs. The evaluation task in a deployment job creates additional metrics that appear on this page. The approver for the job can then review this page to assess whether to approve the model version for deployment.