ML lifecycle management using MLflow

This article describes how MLflow is used in Databricks for machine learning lifecycle management. It also includes examples that introduce each MLflow component and links to content that describes how these components are hosted within Databricks.

Databricks provides ML lifecycle management through managed MLflow: a fully managed and hosted version of MLflow integrated with enterprise security features, high availability, and other Databricks workspace features such as experiment and run management and notebook revision capture.

First-time users should begin with Get started with MLflow experiments, which demonstrates the basic MLflow tracking APIs.

What is MLflow?

MLflow is an open source platform for managing the end-to-end machine learning lifecycle. It has the following primary components:

  • Tracking: Allows you to track experiments to record and compare parameters and results.

  • Models: Allow you to manage and deploy models from a variety of ML libraries to a variety of model serving and inference platforms.

  • Projects: Allow you to package ML code in a reusable, reproducible form to share with other data scientists or transfer to production.

  • Model Registry: Allows you to centralize a model store for managing the full lifecycle of models, including stage transitions from staging to production, with capabilities for versioning and annotating. Databricks provides a managed version of the Model Registry in Unity Catalog.

  • Model Serving: Allows you to host MLflow models as REST endpoints. Databricks provides a unified interface to deploy, govern, and query your served AI models.

MLflow supports Java, Python, R, and REST APIs.

Note

If you’re just getting started with Databricks, consider using MLflow on Databricks Community Edition, which provides a simple managed MLflow experience for lightweight experimentation. Remote execution of MLflow projects is not supported on Databricks Community Edition. No limits are imposed for the initial launch of MLflow on Databricks Community Edition, but we plan to impose moderate limits on the number of experiments and runs.

MLflow data stored in the control plane (experiment runs, metrics, tags, and parameters) is encrypted using a platform-managed key. Encryption using customer-managed keys for managed services is not supported for that data. However, the MLflow models and artifacts stored in your root (DBFS) storage can be encrypted using your own key by configuring customer-managed keys for workspace storage.

MLflow tracking

MLflow on Databricks offers an integrated experience for tracking and securing training runs for machine learning and deep learning models.

Model lifecycle management

MLflow Model Registry is a centralized model repository, a UI, and a set of APIs that enable you to manage the full lifecycle of MLflow Models. Databricks provides a hosted version of the MLflow Model Registry in Unity Catalog. Unity Catalog provides centralized model governance, cross-workspace access, lineage, and deployment. For details about managing the model lifecycle in Unity Catalog, see Manage model lifecycle in Unity Catalog.

If your workspace is not enabled for Unity Catalog, you can use the Workspace Model Registry.

Model Registry concepts

  • Model: An MLflow Model logged from an experiment or run using one of the model flavor’s mlflow.<model-flavor>.log_model methods. After a model is logged, you can register it with the Model Registry.

  • Registered model: An MLflow Model that has been registered with the Model Registry. The registered model has a unique name, versions, model lineage, and other metadata.

  • Model version: A version of a registered model. When a new model is added to the Model Registry, it is added as Version 1. Each model registered to the same model name increments the version number.

  • Model alias: An alias is a mutable, named reference to a particular version of a registered model. Aliases are typically used to specify which model versions are deployed in a given environment in your model training workflows, or to write inference workloads that target a specific alias. For example, you could assign the “Champion” alias of your “Fraud Detection” registered model to the model version that should serve the majority of production traffic, and then write inference workloads that target that alias (that is, make predictions using the “Champion” version).

  • Model stage (workspace model registry only): A model version can be assigned one or more stages. MLflow provides predefined stages for the common use cases: None, Staging, Production, and Archived. With the appropriate permission you can transition a model version between stages or you can request a model stage transition. Model version stages are not used in Unity Catalog.

  • Description: You can annotate a model’s intent, including a description and any relevant information useful for the team such as algorithm description, dataset employed, or methodology.

Example notebooks

For an example that illustrates how to use the Model Registry to build a machine learning application that forecasts the daily power output of a wind farm, see the following:

Model deployment

Databricks Model Serving provides a unified interface to deploy, govern, and query AI models. Each model you serve is available as a REST API that you can integrate into your web or client application.
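Because each served model is exposed as a REST API, a client application only needs to send an authenticated HTTP request. The sketch below builds such a request without sending it. The workspace URL, endpoint name, token, and input columns are hypothetical, and the URL path and the dataframe_records payload shape are assumptions based on how custom model endpoints are commonly queried; check them against your own endpoint before relying on them.

```python
import json
import urllib.request


def build_invocation_request(workspace_url, endpoint_name, token, records):
    """Build (but do not send) a POST request for a model serving endpoint."""
    # Assumption: endpoints are queried at /serving-endpoints/<name>/invocations.
    url = f"{workspace_url.rstrip('/')}/serving-endpoints/{endpoint_name}/invocations"
    # Assumption: the endpoint accepts rows as a list of records.
    body = json.dumps({"dataframe_records": records}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# All values below are hypothetical placeholders.
request = build_invocation_request(
    "https://example.cloud.databricks.com",
    "fraud-detection",
    "<personal-access-token>",
    [{"amount": 120.5, "country": "DE"}],
)
print(request.get_method(), request.full_url)
```

To actually query the endpoint, pass the request to urllib.request.urlopen and parse the JSON response body.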

Model serving supports serving:

  • Custom models. These are Python models packaged in the MLflow format. They can be registered either in Unity Catalog or in the workspace model registry. Examples include scikit-learn, XGBoost, PyTorch, and Hugging Face transformer models.

  • State-of-the-art open models made available by Foundation Model APIs. These models are curated foundation model architectures that support optimized inference. Base models, like Llama-2-70B-chat, BGE-Large, and Mistral-7B, are available for immediate use with pay-per-token pricing. Workloads that require performance guarantees and fine-tuned model variants can be deployed with provisioned throughput.

  • External models. These are models that are hosted outside of Databricks. Examples include foundation models like OpenAI’s GPT-4, Anthropic’s Claude, and others. Endpoints that serve external models can be centrally governed, and customers can establish rate limits and access controls for them.

You can also deploy MLflow models for offline inference. See Deploy models for batch inference.