MLflow Tracing integrations

MLflow Tracing integrates with many popular generative AI libraries and frameworks and offers one-line automatic tracing for all of them. This gives you immediate observability into your generative AI applications with minimal setup.

This broad support lets you add observability to the tools you already use without significant code changes. For custom components or unsupported libraries, MLflow also provides powerful manual tracing APIs.

Automatic tracing captures your application's logic and intermediate steps (such as LLM calls, tool usage, and agent interactions) based on how the specific library or SDK is implemented.

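Note

On serverless compute clusters, autologging for generative AI tracing frameworks is not enabled automatically. You must enable it explicitly by calling the appropriate mlflow.<library>.autolog() function for each integration you want to trace.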
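
One way to make that explicit step repeatable is a small helper that calls `mlflow.<library>.autolog()` for each integration you name. This is only a sketch: `enable_autolog` is a hypothetical helper, not part of MLflow, and it assumes each named library is itself installed.

```python
import importlib


def enable_autolog(libraries):
    """Call mlflow.<library>.autolog() for each name and return the names enabled."""
    enabled = []
    for name in libraries:
        try:
            # Resolves to modules such as mlflow.openai or mlflow.langchain.
            flavor = importlib.import_module(f"mlflow.{name}")
        except ImportError:
            # MLflow, or this particular integration module, is not available here.
            print(f"Skipping {name}: mlflow.{name} could not be imported")
            continue
        flavor.autolog()
        enabled.append(name)
    return enabled
```

For example, `enable_autolog(["openai", "langchain"])` at the top of a serverless notebook would turn on tracing for both integrations before any model call is made.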

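Top integrations at a glance

Here are quickstart examples for the most commonly used integrations, each showing basic usage. For detailed prerequisites and more advanced scenarios, see the dedicated integration guide linked after each example.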

Python
import mlflow
import openai

# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# import os
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"

# Enable auto-tracing for OpenAI
mlflow.openai.autolog()

# Set up MLflow tracking
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/openai-tracing-demo")

openai_client = openai.OpenAI()

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?",
    }
]

response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    temperature=0.1,
    max_tokens=100,
)
# View trace in MLflow UI

OpenAI integration guide

Python
import mlflow
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# import os
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"

mlflow.langchain.autolog()

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/langchain-tracing-demo")

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7, max_tokens=1000)
prompt = PromptTemplate.from_template("Tell me a joke about {topic}.")
chain = prompt | llm | StrOutputParser()

chain.invoke({"topic": "artificial intelligence"})
# View trace in MLflow UI

Full LangChain integration guide

Python
import mlflow
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# import os
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"

mlflow.langchain.autolog() # LangGraph uses LangChain's autolog

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/langgraph-tracing-demo")

@tool
def get_weather(city: str):
    """Use this to get weather information."""
    return f"It might be cloudy in {city}"

llm = ChatOpenAI(model="gpt-4o-mini")
graph = create_react_agent(llm, [get_weather])
result = graph.invoke({"messages": [("user", "what is the weather in sf?")]})
# View trace in MLflow UI

LangGraph integration guide

Python
import mlflow
import anthropic
import os

# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"

mlflow.anthropic.autolog()

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/anthropic-tracing-demo")

client = anthropic.Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, Claude"}],
)
# View trace in MLflow UI

Full Anthropic integration guide

Python
import mlflow
import dspy

# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# import os
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"

mlflow.dspy.autolog()

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/dspy-tracing-demo")

lm = dspy.LM("openai/gpt-4o-mini") # Assumes OPENAI_API_KEY is set
dspy.configure(lm=lm)

class SimpleSignature(dspy.Signature):
    input_text: str = dspy.InputField()
    output_text: str = dspy.OutputField()

program = dspy.Predict(SimpleSignature)
result = program(input_text="Summarize MLflow Tracing.")
# View trace in MLflow UI

Full DSPy integration guide

Python
import mlflow
import os
from openai import OpenAI # Databricks FMAPI uses OpenAI client

# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"

mlflow.openai.autolog() # Traces Databricks FMAPI using OpenAI client

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/databricks-fmapi-tracing")

client = OpenAI(
    api_key=os.environ.get("DATABRICKS_TOKEN"),
    base_url=f"{os.environ.get('DATABRICKS_HOST')}/serving-endpoints"
)
response = client.chat.completions.create(
    model="databricks-llama-4-maverick",
    messages=[{"role": "user", "content": "Key features of MLflow?"}],
)
# View trace in MLflow UI

Full Databricks integration guide

Python
import mlflow
import boto3

# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# import os
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"

mlflow.bedrock.autolog()

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/bedrock-tracing-demo")

bedrock = boto3.client(
    service_name="bedrock-runtime",
    region_name="us-east-1" # Replace with your region
)
response = bedrock.converse(
    modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[{"role": "user", "content": [{"text": "Hello World in one line."}]}]
)
# View trace in MLflow UI

Full Bedrock integration guide

Python
import mlflow
from autogen import ConversableAgent
import os

# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"

mlflow.autogen.autolog()

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/autogen-tracing-demo")

config_list = [{"model": "gpt-4o-mini", "api_key": os.environ.get("OPENAI_API_KEY")}]
assistant = ConversableAgent("assistant", llm_config={"config_list": config_list})
user_proxy = ConversableAgent("user_proxy", human_input_mode="NEVER", code_execution_config=False)

user_proxy.initiate_chat(assistant, message="What is 2+2?")
# View trace in MLflow UI

AutoGen integration guide

Secure API key management

For production environments, Databricks recommends using AI Gateway or Databricks secrets to manage API keys. AI Gateway is the preferred method and provides additional governance features.

Warning

Never commit API keys directly in your code or notebooks. Always use AI Gateway or Databricks secrets for sensitive credentials.
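
For code that runs outside a notebook, one common pattern that keeps keys out of source files is to read them from the environment and fail early when they are missing. The sketch below illustrates that idea; `require_env` is a hypothetical helper, and how the variable gets populated (for example, from a secret) is left to your deployment.

```python
import os


def require_env(name):
    """Return the value of an environment variable, raising if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        # Fail fast with a clear message instead of falling back to a hardcoded key.
        raise RuntimeError(
            f"{name} is not set; provide it through your environment or a secret, "
            "not in source code"
        )
    return value
```

A client could then be created with `openai.OpenAI(api_key=require_env("OPENAI_API_KEY"))`, so a missing key surfaces as a configuration error rather than as a literal in the code.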

AI Gateway (Recommended)
Databricks secrets

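Databricks recommends AI Gateway for governing and monitoring access to generative AI models.

Create a foundation model endpoint configured with AI Gateway:

In your Databricks workspace, go to Serving > Create new endpoint.
Choose an endpoint type and provider.
Configure the endpoint with your API key.
While configuring the endpoint, enable AI Gateway and set up rate limiting, fallbacks, and guardrails as needed.
You can get autogenerated code to quickly start querying the endpoint: open the endpoint under Serving and go to Query. Make sure to add the tracing code: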

Python
import mlflow
from openai import OpenAI
import os

# How to get your Databricks token: https://docs.databricks.com/en/dev-tools/auth/pat.html
# DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')
# Alternatively in a Databricks notebook you can use this:
DATABRICKS_TOKEN = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get()

# Enable auto-tracing for OpenAI
mlflow.openai.autolog()

# Set up MLflow tracking (if running outside Databricks)
# If running in a Databricks notebook, these are not needed.
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/my-genai-app")

client = OpenAI(
  api_key=DATABRICKS_TOKEN,
  base_url="<YOUR_HOST_URL>/serving-endpoints"
)

chat_completion = client.chat.completions.create(
  messages=[
  {
    "role": "system",
    "content": "You are an AI assistant"
  },
  {
    "role": "user",
    "content": "What is MLflow?"
  }
  ],
  model="<YOUR_ENDPOINT_NAME>",
  max_tokens=256
)

print(chat_completion.choices[0].message.content)

Use Databricks secrets to manage API keys:

Create a secret scope and store your API key:

Python
from databricks.sdk import WorkspaceClient

# Set your secret scope and key names
secret_scope_name = "llm-secrets"  # Choose an appropriate scope name
secret_key_name = "api-key"        # Choose an appropriate key name

# Create the secret scope and store your API key
w = WorkspaceClient()
w.secrets.create_scope(scope=secret_scope_name)
w.secrets.put_secret(
    scope=secret_scope_name,
    key=secret_key_name,
    string_value="your-api-key-here"  # Replace with your actual API key
)

Retrieve and use the secret in your code:

Python
import mlflow
import openai
import os

# Configure your secret scope and key names
secret_scope_name = "llm-secrets"
secret_key_name = "api-key"

# Retrieve the API key from Databricks secrets
os.environ["OPENAI_API_KEY"] = dbutils.secrets.get(
    scope=secret_scope_name,
    key=secret_key_name
)

# Enable automatic tracing
mlflow.openai.autolog()

# Use OpenAI client with securely managed API key
client = openai.OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Explain MLflow Tracing"}],
    max_tokens=100
)

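Enabling multiple auto-tracing integrations

Because generative AI applications often combine several libraries, MLflow Tracing lets you enable auto-tracing for multiple integrations at the same time and provides a unified tracing experience.

For example, to enable tracing for both LangChain and direct OpenAI calls: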

Python
import mlflow

# Enable MLflow Tracing for both LangChain and OpenAI
mlflow.langchain.autolog()
mlflow.openai.autolog()

# Your code using both LangChain and OpenAI directly...
# ... an example can be found on the Automatic Tracing page ...

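MLflow generates a single, cohesive trace that combines the steps from both LangChain and direct OpenAI LLM calls, so you can inspect the entire flow. For more examples of combining integrations, see the Automatic Tracing page.

Disabling auto-tracing

To disable auto-tracing for a specific library, call mlflow.<library>.autolog(disable=True). To disable all autologging integrations at once, use mlflow.autolog(disable=True).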

Python
import mlflow

# Disable for a specific library
mlflow.openai.autolog(disable=True)

# Disable all autologging
mlflow.autolog(disable=True)
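
If you only want to suppress tracing for part of a program, the same `disable=True` call can be wrapped in a context manager. The following is a sketch under that assumption; `autolog_paused` is a hypothetical helper, not an MLflow API, and it takes the flavor module (for example `mlflow.openai`) as its argument.

```python
from contextlib import contextmanager


@contextmanager
def autolog_paused(flavor):
    """Disable autologging for one MLflow flavor module, then re-enable it on exit."""
    flavor.autolog(disable=True)
    try:
        yield
    finally:
        # Re-enables with default settings; custom autolog() arguments are not restored.
        flavor.autolog()
```

Used as `with autolog_paused(mlflow.openai): ...`, calls made inside the block are not traced, and tracing resumes afterwards even if the block raises.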
