MLflow Tracing Integrations

MLflow Tracing is integrated with a wide array of popular Generative AI libraries and frameworks, offering a one-line automatic tracing experience for all of them. This allows you to gain immediate observability into your GenAI applications with minimal setup.

Automatic tracing captures your application's logic and intermediate steps, such as LLM calls, tool usage, and agent interactions, based on how your application uses the specific library or SDK.

For a deeper dive into how automatic tracing works, its prerequisites, and examples of combining it with manual tracing, please see the main Automatic Tracing guide. The quick examples below highlight some top integrations. Detailed guides for each supported library, covering prerequisites, advanced examples, and specific behaviors, are available on their respective pages in this section.

Top Integrations at a Glance

Here are quick-start examples for some of the most commonly used integrations. Click on a tab to see a basic usage example. For detailed prerequisites and more advanced scenarios for each, please visit their dedicated integration pages (linked from the tabs or the list below).

Python
import mlflow
import openai

# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# import os
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"

# Enable auto-tracing for OpenAI
mlflow.openai.autolog()

# Set up MLflow tracking
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/openai-tracing-demo")

openai_client = openai.OpenAI()

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?",
    }
]

response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    temperature=0.1,
    max_tokens=100,
)
# View trace in MLflow UI

Full OpenAI Integration Guide

Secure API Key Management

For production environments, Databricks recommends that you use AI Gateway or Databricks secrets to manage API keys. AI Gateway is the preferred method and offers additional governance features.

warning

Never commit API keys directly in your code or notebooks. Always use AI Gateway or Databricks secrets for sensitive credentials.
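As a sketch of the secrets-based approach, the helper below reads a credential from a Databricks secret scope when `dbutils` is available (i.e., inside a Databricks notebook or job) and falls back to an environment variable otherwise. The scope and key names shown in the usage comment (`genai`, `openai_api_key`) are hypothetical placeholders, not values defined by this guide.

```python
import os


def get_api_key(scope: str, key: str, env_var: str) -> str:
    """Fetch a credential from Databricks secrets when running in a
    notebook, otherwise from an environment variable. Never hard-code keys."""
    try:
        # `dbutils` is only defined inside Databricks notebooks and jobs.
        return dbutils.secrets.get(scope=scope, key=key)  # noqa: F821
    except NameError:
        value = os.environ.get(env_var)
        if value is None:
            raise RuntimeError(
                f"Set {env_var} or run inside a Databricks notebook."
            )
        return value


# Example usage (hypothetical scope/key names):
# openai_client = openai.OpenAI(
#     api_key=get_api_key("genai", "openai_api_key", "OPENAI_API_KEY")
# )
```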

Databricks recommends Mosaic AI Gateway for governing and monitoring access to gen AI models.

Create a Foundation Model endpoint configured with AI Gateway:

  1. In your Databricks workspace, go to Serving > Create new endpoint.
  2. Choose an endpoint type and provider.
  3. Configure the endpoint with your API key.
  4. During endpoint configuration, enable AI Gateway and configure rate limiting, fallbacks, and guardrails as needed.
  5. You can get autogenerated code to quickly start querying the endpoint. Go to Serving > your endpoint > Use > Query. Make sure to add the tracing code:
Python
import mlflow
from openai import OpenAI
import os

# How to get your Databricks token: https://docs.databricks.com/en/dev-tools/auth/pat.html
# Outside Databricks, read the token from an environment variable:
# DATABRICKS_TOKEN = os.environ.get("DATABRICKS_TOKEN")
# In a Databricks notebook, you can retrieve it from the notebook context:
DATABRICKS_TOKEN = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get()

# Enable auto-tracing for OpenAI
mlflow.openai.autolog()

# Set up MLflow tracking (if running outside Databricks)
# If running in a Databricks notebook, these are not needed.
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/my-genai-app")

client = OpenAI(
    api_key=DATABRICKS_TOKEN,
    base_url="<YOUR_HOST_URL>/serving-endpoints",
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "system",
            "content": "You are an AI assistant",
        },
        {
            "role": "user",
            "content": "What is MLflow?",
        },
    ],
    model="<YOUR_ENDPOINT_NAME>",
    max_tokens=256,
)

print(chat_completion.choices[0].message.content)

Enabling Multiple Auto Tracing Integrations

Because GenAI applications often combine multiple libraries, MLflow Tracing lets you enable auto-tracing for several integrations simultaneously, providing a unified tracing experience.

For example, to enable both LangChain and direct OpenAI tracing:

Python
import mlflow

# Enable MLflow Tracing for both LangChain and OpenAI
mlflow.langchain.autolog()
mlflow.openai.autolog()

# Your code using both LangChain and OpenAI directly...
# ... an example can be found on the Automatic Tracing page ...

MLflow will generate a single, cohesive trace that combines steps from both LangChain and direct OpenAI LLM calls, allowing you to inspect the complete flow. More examples of combining integrations can be found on the Automatic Tracing page.

Disabling Auto Tracing

Auto tracing for any specific library can be disabled by calling mlflow.<library>.autolog(disable=True). To disable all autologging integrations at once, use mlflow.autolog(disable=True).

Python
import mlflow

# Disable for a specific library
mlflow.openai.autolog(disable=True)

# Disable all autologging
mlflow.autolog(disable=True)