Open source vs. managed MLflow on Databricks

This page helps open source MLflow users get started with MLflow on Databricks. Databricks-managed MLflow uses the same APIs as open source MLflow and adds capabilities through integrations with the broader Databricks platform.
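As a rough illustration of what "the same APIs" means in practice, the sketch below logs a run with identical MLflow calls against either backend, changing only the tracking and registry URIs. The helper names, the local server address, and the experiment path are hypothetical. It also assumes that, when running outside a Databricks notebook, workspace credentials are already configured (for example through `DATABRICKS_HOST` and `DATABRICKS_TOKEN`).

```python
def resolve_uris(on_databricks: bool, local_uri: str = "http://localhost:5000") -> dict:
    """Pick tracking and registry URIs for the chosen backend (hypothetical helper)."""
    if on_databricks:
        # "databricks" targets the workspace tracking server;
        # "databricks-uc" targets the Unity Catalog model registry.
        return {"tracking_uri": "databricks", "registry_uri": "databricks-uc"}
    # Assumed address of a self-hosted open source MLflow server.
    return {"tracking_uri": local_uri, "registry_uri": local_uri}


def log_example_run(uris: dict, experiment: str) -> None:
    """Log one parameter and one metric; the calls are the same on both backends."""
    import mlflow  # imported lazily so resolve_uris works without MLflow installed

    mlflow.set_tracking_uri(uris["tracking_uri"])
    mlflow.set_registry_uri(uris["registry_uri"])
    mlflow.set_experiment(experiment)
    with mlflow.start_run():
        mlflow.log_param("alpha", 0.5)
        mlflow.log_metric("rmse", 0.78)


# Example (hypothetical experiment path in a Databricks workspace):
# log_example_run(resolve_uris(on_databricks=True), "/Shared/example-experiment")
```

Only the URIs differ between the two targets, which is why existing open source MLflow code usually needs little or no change to run against a Databricks workspace.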

Benefits of managed MLflow on Databricks

Open source MLflow provides the core data model, API, and SDK. This means your data and workloads are always portable.

Managed MLflow on Databricks adds:

  • Enterprise-grade governance and security through integration with the Databricks platform, Lakehouse, and Unity Catalog. Your AI and ML data, tools, agents, models, and other assets can be governed and used in the same platform as the rest of your data and workloads.
  • Fully managed hosting on production-ready, scalable servers
  • Integrations for development and production with the broader Mosaic AI platform

For more details on benefits, see the Managed MLflow product page. The rest of this page covers technical details.

tip

Your data is always yours: the core data model and APIs are completely open source, so you can export and use your MLflow data anywhere.
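One way to picture this portability is a minimal export sketch that pulls run metadata for one experiment into a CSV file using `mlflow.search_runs`, which returns a pandas DataFrame by default. The function name, experiment name, and output path are hypothetical. The sketch assumes the tracking URI has already been set and covers only run-level parameters, metrics, and tags, not artifacts or traces.

```python
def export_runs_to_csv(experiment_name: str, out_path: str) -> int:
    """Write all runs of one experiment to a CSV file and return the row count."""
    import mlflow  # imported lazily; requires MLflow and pandas at call time

    # search_runs returns one row per run, with params, metrics, and tags as columns.
    runs = mlflow.search_runs(experiment_names=[experiment_name])
    runs.to_csv(out_path, index=False)
    return len(runs)


# Example (hypothetical names):
# export_runs_to_csv("/Shared/example-experiment", "runs.csv")
```

Because the result is a plain DataFrame, the same data could be written to Parquet or loaded into another system instead of CSV.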

Additional capabilities on Databricks

This section lists important capabilities enabled on managed MLflow through integrations with the broader Databricks platform. For overviews of all capabilities of MLflow for GenAI, see MLflow 3 for GenAI and the open source GenAI documentation.

Enterprise-grade governance and security

  • Enterprise governance with Unity Catalog: Models, feature tables, vector indexes, tools, and more are governed centrally under Unity Catalog. When deploying agents, authentication for agent, data, and tool access can be precisely controlled using both authentication passthrough and on-behalf-of-user authentication.
  • Lakehouse data integration: Leverage AI/BI Genie spaces and dashboards and Databricks SQL to analyze logs and traces from MLflow experiments.
  • Security and management: MLflow permissions follow the same governance patterns as the broader Databricks platform.
  • Auditing: System tables provide usage and audit logs for managed MLflow.

Fully managed hosting on production-ready servers

  • Fully managed: Databricks provides MLflow servers with automatic updates, designed for scalability and production. For details, see Resource limits.
  • Trusted platform: Managed MLflow is used by thousands of customers across the globe.

Integrations for development and production

Development of AI and ML is streamlined by integrations such as:

Production AI and ML are facilitated by integrations such as:

note

Telemetry collection was introduced in open source MLflow 3.2.0 and is disabled by default on Databricks. For more details, refer to the MLflow usage tracking documentation.
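For readers who also run open source MLflow outside Databricks and want the same behavior there, the sketch below sets an opt-out environment variable before MLflow is imported. The variable name `MLFLOW_DISABLE_TELEMETRY` is taken from a reading of the usage tracking documentation and should be verified against the MLflow version in use. The helper name is hypothetical.

```python
import os
from typing import MutableMapping


def disable_mlflow_telemetry(env: MutableMapping[str, str] = os.environ) -> None:
    """Set the assumed telemetry opt-out variable in the given environment mapping."""
    # Assumption: MLflow checks this variable; confirm the name in the
    # usage tracking documentation for your MLflow version.
    env["MLFLOW_DISABLE_TELEMETRY"] = "true"


# Call this before the first `import mlflow` in the process so the setting
# is in place before MLflow reads its configuration.
# disable_mlflow_telemetry()
```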

Next steps

Get started with MLflow on Databricks:

Related reference material: