Get started with MLflow 3.0 (Beta)
This feature is in Beta.
This article gets you started with MLflow 3.0. It describes how to install MLflow 3.0, includes several demo notebooks, and links to pages that cover the new features of MLflow 3.0 in more detail.
What is MLflow 3.0 and how is it different from the existing MLflow version?
MLflow 3.0 on Databricks delivers state-of-the-art experiment tracking, observability, and performance evaluation for machine learning models, generative AI applications, and agents on the Databricks lakehouse. Using MLflow 3.0 on Databricks, you can:
- Centrally track and analyze the performance of your models, AI applications, and agents across all environments, from interactive queries in a development notebook through production batch or real-time serving deployments.
- Orchestrate evaluation and deployment workflows using Unity Catalog, and access comprehensive status logs for each version of your model, AI application, or agent.
- View and access model metrics and parameters from the model version page in Unity Catalog and from the REST API.
- Annotate requests and responses (traces) for all of your gen AI applications and agents, enabling human experts and automated techniques (such as LLM-as-a-judge) to provide rich feedback. You can use this feedback to assess and compare the performance of application versions and to build datasets for improving quality (see the tracing sketch after this list).
These capabilities simplify and streamline evaluation, deployment, debugging, and monitoring for all of your AI initiatives.
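For example, capturing traces for a gen AI application only requires instrumenting your code with the MLflow tracing decorator. The following is a minimal sketch, assuming the MLflow Python tracing API (`mlflow.trace`) and a hypothetical `answer_question` function standing in for your application or agent; feedback can then be attached to the captured traces in the MLflow UI or through the feedback APIs.

```python
import mlflow

# Hypothetical gen AI application function. The @mlflow.trace decorator
# records its inputs, outputs, and latency as a trace that human experts
# or automated judges can later annotate with feedback.
@mlflow.trace
def answer_question(question: str) -> str:
    # Replace this stub with a real call to your model or agent.
    return f"Stub answer for: {question}"

answer_question("What is MLflow 3.0?")
```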
Much of the new functionality of MLflow 3.0 derives from the new concept of a LoggedModel. LoggedModels are produced from MLflow runs. Runs are an existing concept in MLflow and can be thought of as jobs that execute model code. Training runs produce models as outputs, and evaluation runs use existing models as inputs to produce metrics and other information you can use to assess the performance of a model. In MLflow 3.0, the concept of a model produced by a run has been separated into its own dedicated object called LoggedModel. A LoggedModel is used to track the model lifecycle across different runs, including training and evaluation runs. For more details, see Track and compare models using MLflow Logged Models (Beta).
MLflow 3.0 also introduces the concept of a Deployment Job. Deployment Jobs use Databricks Jobs to manage the model lifecycle, including steps like evaluation, approval, and deployment. These model workflows are governed by Unity Catalog, and all events are saved to an activity log that is available on the model version page in Unity Catalog.
Install MLflow 3.0
To use MLflow 3.0, you must install the pre-release MLflow wheel. Run the following commands each time the notebook is started:
%pip install mlflow --upgrade --pre
dbutils.library.restartPython()
Example notebooks
The following pages illustrate the MLflow 3.0 model tracking workflow for traditional ML, deep learning, and gen AI. Each page includes an example notebook.
- MLflow 3.0 traditional ML workflow (Beta).
- MLflow 3.0 deep learning workflow (Beta).
- MLflow 3.0 generative AI workflow (Beta).
Next steps
To learn more about the new features of MLflow 3.0, see the following articles: