Track prompt versions alongside application versions

Beta

This feature is in Beta.

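This guide shows how to integrate prompts from the MLflow Prompt Registry into a generative AI application and track prompt and application versions together. When you use mlflow.set_active_model() with prompts from the registry, MLflow automatically creates lineage between the prompt versions and the application versions.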

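What you'll learn:

Load and use prompts from the MLflow Prompt Registry in your application
Track application versions with LoggedModels
View the automatic lineage between prompt versions and application versions
Update a prompt and see how the change flows through to your application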

Prerequisites

Install MLflow and the required packages:
Bash
```
pip install --upgrade "mlflow[databricks]>=3.1.0" openai
```
To create an MLflow experiment, follow the quickstart for setting up your environment.
Make sure you have access to a Unity Catalog schema where you have MANAGE permission, so that you can use the Prompt Registry.
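
If you want to confirm that the installed MLflow meets the `>=3.1.0` requirement from the install command, a small check like the following may help. This is a minimal sketch: `parse_version` and `meets_minimum` are hypothetical helpers written for this guide, not MLflow functions, and the comparison ignores pre-release suffixes.

```python
from importlib import metadata


def parse_version(text: str, width: int = 3) -> tuple:
    """Turn '3.1.0' into (3, 1, 0).

    Assumption: only the leading digits of each dotted piece matter, so a
    pre-release such as '3.1.0rc1' is treated the same as '3.1.0'.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad or trim so that '3.1' and '3.1.0' compare as equal.
    parts = parts[:width]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "3.1.0") -> bool:
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


try:
    installed_version = metadata.version("mlflow")
    print(f"mlflow {installed_version} meets minimum: {meets_minimum(installed_version)}")
except metadata.PackageNotFoundError:
    print("mlflow is not installed in this environment")
```

If the check reports a version below the minimum, rerun the `pip install --upgrade` command above.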

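Step 1: Create a prompt in the registry

First, create the prompt that your application will use. If you already created a prompt by following the guide on creating and editing prompts, you can skip this step.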

Python
import mlflow

# Replace with a Unity Catalog schema where you have MANAGE permission
uc_schema = "workspace.default"
prompt_name = "customer_support_prompt"

# Define the prompt template with variables
initial_template = """\
You are a helpful customer support assistant for {{company_name}}.

Please help the customer with their inquiry about: {{topic}}

Customer Question: {{question}}

Provide a friendly, professional response that addresses their concern.
"""

# Register a new prompt
prompt = mlflow.genai.register_prompt(
    name=f"{uc_schema}.{prompt_name}",
    template=initial_template,
    commit_message="Initial customer support prompt",
    tags={
        "author": "support-team@company.com",
        "use_case": "customer_service",
        "department": "customer_support",
        "language": "en"
    }
)

print(f"Created prompt '{prompt.name}' (version {prompt.version})")
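
The template above marks its variables with double braces, such as `{{company_name}}`. To see which variables a template expects and what a filled-in prompt looks like without calling the registry, you can use a plain-Python stand-in like the one below. This is only a sketch: `template_variables` and `render` are hypothetical helpers and do not reproduce MLflow's own formatting logic. The guide itself uses `prompt.format()` in the next step.

```python
import re

# Matches {{name}}, allowing optional spaces inside the braces.
VARIABLE_PATTERN = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def template_variables(template: str) -> list:
    """Return the distinct variable names in order of first appearance."""
    seen = []
    for name in VARIABLE_PATTERN.findall(template):
        if name not in seen:
            seen.append(name)
    return seen


def render(template: str, **values) -> str:
    """Substitute every {{name}}; raise KeyError if a variable has no value."""
    missing = [name for name in template_variables(template) if name not in values]
    if missing:
        raise KeyError(f"Missing template variables: {missing}")
    return VARIABLE_PATTERN.sub(lambda match: str(values[match.group(1)]), template)


template = (
    "You are a helpful customer support assistant for {{company_name}}.\n"
    "Topic: {{topic}}"
)
print(template_variables(template))
print(render(template, company_name="TechCorp", topic="billing"))
```

Listing the variables before you register a template is a cheap way to catch a misspelled placeholder.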

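Step 2: Use the prompt in an application with versioning enabled

Next, create a generative AI application that loads this prompt from the registry and uses it. Use mlflow.set_active_model() to track the application version.

When you call mlflow.set_active_model(), MLflow creates a LoggedModel that serves as the metadata hub for the application version. The LoggedModel does not store the application code itself. Instead, it acts as a central record that links to external code (such as a Git commit) and configuration parameters, and it automatically tracks the prompts from the registry that the application uses. For details on how application version tracking works, see "Track application versions with MLflow".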

Python
import mlflow
import subprocess
from openai import OpenAI

# Enable MLflow's autologging to instrument your application with Tracing
mlflow.openai.autolog()

# Connect to a Databricks LLM via OpenAI using the same credentials as MLflow
# Alternatively, you can use your own OpenAI credentials here
mlflow_creds = mlflow.utils.databricks_utils.get_databricks_host_creds()
client = OpenAI(
    api_key=mlflow_creds.token,
    base_url=f"{mlflow_creds.host}/serving-endpoints"
)

# Define your application and its version identifier
app_name = "customer_support_agent"

# Get current git commit hash for versioning
try:
    git_commit = (
        subprocess.check_output(["git", "rev-parse", "HEAD"])
        .decode("ascii")
        .strip()[:8]
    )
    version_identifier = f"git-{git_commit}"
except (subprocess.CalledProcessError, FileNotFoundError):
    # Fallback if not in a git repo, or if git is not installed
    version_identifier = "local-dev"
logged_model_name = f"{app_name}-{version_identifier}"

# Set the active model context - this creates a LoggedModel that represents this version of your application
active_model_info = mlflow.set_active_model(name=logged_model_name)
print(
    f"Active LoggedModel: '{active_model_info.name}', Model ID: '{active_model_info.model_id}'"
)

# Log application parameters
# These parameters help you track the configuration of this app version
app_params = {
    "llm": "databricks-claude-sonnet-4",
    "temperature": 0.7,
    "max_tokens": 500
}
mlflow.log_model_params(model_id=active_model_info.model_id, params=app_params)

# Load the prompt from the registry
# NOTE: Loading the prompt AFTER calling set_active_model() is what enables
# automatic lineage tracking between the prompt version and the LoggedModel
prompt = mlflow.genai.load_prompt(f"prompts:/{uc_schema}.{prompt_name}/1")
print(f"Loaded prompt version {prompt.version}")

# Use the trace decorator to capture the application's entry point
# Each trace created by this function will be automatically linked to the LoggedModel (application version) we set above.  In turn, the LoggedModel is linked to the prompt version that was loaded from the registry
@mlflow.trace
def customer_support_app(company_name: str, topic: str, question: str):
    # Format the prompt with variables
    formatted_prompt = prompt.format(
        company_name=company_name,
        topic=topic,
        question=question
    )

    # Call the LLM
    response = client.chat.completions.create(
        model="databricks-claude-sonnet-4",  # Replace with your model
        messages=[
            {
                "role": "user",
                "content": formatted_prompt,
            },
        ],
        temperature=0.7,
        max_tokens=500
    )
    return response.choices[0].message.content

# Test the application
result = customer_support_app(
    company_name="TechCorp",
    topic="billing",
    question="I was charged twice for my subscription last month. Can you help?"
)
print(f"\nResponse: {result}")
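
The `load_prompt` call above builds its URI inline as `prompts:/<catalog>.<schema>.<prompt name>/<version>`. If you load prompts in several places, a small helper can make those pieces explicit and catch malformed input early. This is a sketch only: `build_prompt_uri` is a hypothetical function, and its checks assume a two-part schema name and version numbers starting at 1, as in this guide's examples.

```python
def build_prompt_uri(uc_schema: str, prompt_name: str, version: int) -> str:
    """Build a registry URI in the form used by this guide.

    Assumptions (not enforced by MLflow itself in this sketch):
    - uc_schema looks like '<catalog>.<schema>'
    - version is a positive integer
    """
    if uc_schema.count(".") != 1:
        raise ValueError("uc_schema must look like '<catalog>.<schema>'")
    if version < 1:
        raise ValueError("version must be a positive integer")
    return f"prompts:/{uc_schema}.{prompt_name}/{version}"


uri = build_prompt_uri("workspace.default", "customer_support_prompt", 1)
print(uri)
```

You could then pass the result to `mlflow.genai.load_prompt()` in place of the inline f-string.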

Step 3: View the automatic lineage

Step 4: Update the prompt and track the change

Now improve the prompt and see how the new version is tracked automatically when the application uses it.

Python
# Create an improved version of the prompt
improved_template = """\
You are a helpful and empathetic customer support assistant for {{company_name}}.

Customer Topic: {{topic}}
Customer Question: {{question}}

Please provide a response that:
1. Acknowledges the customer's concern with empathy
2. Provides a clear solution or next steps
3. Offers additional assistance if needed
4. Maintains a friendly, professional tone

Remember to:
- Use the customer's name if provided
- Be concise but thorough
- Avoid technical jargon unless necessary
"""

# Register the new version
updated_prompt = mlflow.genai.register_prompt(
    name=f"{uc_schema}.{prompt_name}",
    template=improved_template,
    commit_message="Added structured response guidelines for better customer experience",
    tags={
        "author": "support-team@company.com",
        "improvement": "Added empathy guidelines and response structure"
    }
)

print(f"Created version {updated_prompt.version} of '{updated_prompt.name}'")
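
Before or after registering a new version, it can be useful to see exactly which lines of the template changed. The sketch below uses Python's standard `difflib` module for this. `diff_templates` is a hypothetical helper, and the two short templates are abbreviated stand-ins for the full templates in this guide.

```python
import difflib


def diff_templates(old: str, new: str, old_label: str = "v1", new_label: str = "v2") -> str:
    """Return a unified diff of two templates, or an empty string if they match."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",
    )
    return "\n".join(lines)


# Abbreviated stand-ins for the initial and improved templates.
old_template = (
    "You are a helpful customer support assistant for {{company_name}}.\n"
    "Customer Question: {{question}}"
)
new_template = (
    "You are a helpful and empathetic customer support assistant for {{company_name}}.\n"
    "Customer Question: {{question}}"
)
print(diff_templates(old_template, new_template))
```

In the guide's own code you would call `diff_templates(initial_template, improved_template)`.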

Step 5: Use the updated prompt in your application

Next, create a new application version that uses the new prompt version, so that the change is tracked.

Python
# Create a new application version
new_version_identifier = "v2-improved-prompt"
new_logged_model_name = f"{app_name}-{new_version_identifier}"

# Set the new active model
active_model_info_v2 = mlflow.set_active_model(name=new_logged_model_name)
print(
    f"Active LoggedModel: '{active_model_info_v2.name}', Model ID: '{active_model_info_v2.model_id}'"
)

# Log updated parameters
app_params_v2 = {
    "llm": "databricks-claude-sonnet-4",
    "temperature": 0.7,
    "max_tokens": 500,
    "prompt_version": "2"  # Track which prompt version we're using
}
mlflow.log_model_params(model_id=active_model_info_v2.model_id, params=app_params_v2)

# Load the new prompt version
prompt_v2 = mlflow.genai.load_prompt(f"prompts:/{uc_schema}.{prompt_name}/2")

# Update the app to use the new prompt
@mlflow.trace
def customer_support_app_v2(company_name: str, topic: str, question: str):
    # Format the prompt with variables
    formatted_prompt = prompt_v2.format(
        company_name=company_name,
        topic=topic,
        question=question
    )

    # Call the LLM
    response = client.chat.completions.create(
        model="databricks-claude-sonnet-4",
        messages=[
            {
                "role": "user",
                "content": formatted_prompt,
            },
        ],
        temperature=0.7,
        max_tokens=500
    )
    return response.choices[0].message.content

# Test with the same question to see the difference
result_v2 = customer_support_app_v2(
    company_name="TechCorp",
    topic="billing",
    question="I was charged twice for my subscription last month. Can you help?"
)
print(f"\nImproved Response: {result_v2}")
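
With two application versions logged, you may want a quick local view of how their configurations differ. The following sketch compares two parameter dictionaries shaped like `app_params` and `app_params_v2`. `diff_params` is a hypothetical helper and does not query MLflow; it only compares the dictionaries you pass in.

```python
def diff_params(old: dict, new: dict) -> dict:
    """Map each changed key to a (before, after) tuple.

    A key missing on one side is reported as None, so this sketch cannot
    tell a missing key apart from a key explicitly set to None.
    """
    changes = {}
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key), new.get(key)
        if before != after:
            changes[key] = (before, after)
    return changes


params_v1 = {"llm": "databricks-claude-sonnet-4", "temperature": 0.7, "max_tokens": 500}
params_v2 = {**params_v1, "prompt_version": "2"}
print(diff_params(params_v1, params_v2))
```

Here the only difference is the added `prompt_version` entry, which matches the change made in this step.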

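Next step: Evaluate prompt versions

Now that you are tracking different versions of your prompt and application, you can systematically evaluate which prompt version performs best. MLflow's evaluation framework lets you compare multiple prompt versions side by side using LLM judges and custom metrics.

To learn how to evaluate prompt versions, see Evaluate prompts. That guide covers how to:

Run evaluations against different prompt versions
Compare results across versions in the evaluation UI
Use both built-in LLM judges and custom metrics
Make data-driven decisions about which prompt version to deploy

By combining prompt versioning with evaluation, you can iterate on prompts with confidence, seeing exactly how each change affects your quality metrics.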

Next steps

Evaluate prompts - Learn how to assess the quality of different prompt versions
Link production traces to app versions - Track versions in your production environment
