Legacy input and output agent schemas

Note

Databricks recommends migrating to the ResponsesAgent schema to author agents. See Author AI agents in code.

AI agents must adhere to specific input and output schema requirements to be compatible with other features on Databricks. This page explains how to use the legacy agent authoring signatures and interfaces: the ChatAgent interface, the ChatModel interface, the SplitChatMessagesRequest input schema, and the StringResponse output schema.

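Author a legacy ChatAgent agent

The MLflow ChatAgent interface is similar to the OpenAI ChatCompletion schema, but not strictly compatible with it.

ChatAgent makes it easy to wrap existing agents for Databricks compatibility.

To learn how to create a ChatAgent, see the examples in the following section and MLflow documentation - What is the ChatAgent interface.

To author and deploy agents using ChatAgent, install the following:

- databricks-agents 0.16.0 or above
- mlflow 2.20.2 or above
- Python 3.10 or above.
  - To meet this requirement, you can use serverless compute or Databricks Runtime 13.3 LTS or above.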

Python
%pip install -U -qqqq databricks-agents>=0.16.0 mlflow>=2.20.2
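
After installing, it can help to confirm that the environment satisfies these minimums. The following is a minimal sketch, not part of the Databricks tooling: `parse_version`, `meets_minimum`, and `check_environment` are hypothetical helpers that compare plain dotted release numbers only, and installed versions are read with the standard library's `importlib.metadata`.

```python
import sys
from importlib import metadata

# Minimum versions stated in the requirements list above.
REQUIREMENTS = {"databricks-agents": "0.16.0", "mlflow": "2.20.2"}


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a plain dotted version such as '2.20.2' into (2, 20, 2).

    Assumption: only numeric release segments are handled; parsing stops
    at the first segment that is not purely numeric (for example 'rc1').
    """
    parts = []
    for segment in text.split("."):
        if not segment.isdigit():
            break
        parts.append(int(segment))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two dotted versions numerically, segment by segment."""
    return parse_version(installed) >= parse_version(minimum)


def check_environment() -> dict[str, bool]:
    """Report whether Python and each required package meet the minimums."""
    results = {"python": sys.version_info >= (3, 10)}
    for package, minimum in REQUIREMENTS.items():
        try:
            results[package] = meets_minimum(metadata.version(package), minimum)
        except metadata.PackageNotFoundError:
            # A missing package cannot satisfy the requirement.
            results[package] = False
    return results


print(check_environment())
```

Any entry reported as `False` points at a package to install or upgrade before continuing.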

What if I already have an agent?

If you already have an agent built with LangChain, LangGraph, or a similar framework, you don't need to rewrite it to use it on Databricks. Instead, wrap your existing agent with the MLflow ChatAgent interface.

Write a Python wrapper class that inherits from mlflow.pyfunc.ChatAgent.

Inside the wrapper class, keep your existing agent as an attribute: self.agent = your_existing_agent.
The ChatAgent class requires you to implement a predict method to handle non-streaming requests.

predict must accept:
- messages: list[ChatAgentMessage], a list of ChatAgentMessage objects, each with a role (such as "user" or "assistant"), the prompt, and an ID.
- (Optional) context: Optional[ChatContext] and custom_inputs: Optional[dict] for additional data.
Python
```
import uuid
from mlflow.types.agent import ChatAgentMessage

# input example
[
  ChatAgentMessage(
    id=str(uuid.uuid4()),  # Generate a unique ID for each message
    role="user",
    content="What's the weather in Paris?"
  )
]
```
predict must return a ChatAgentResponse.
Python
```
import uuid
from mlflow.types.agent import ChatAgentMessage, ChatAgentResponse

# output example
ChatAgentResponse(
  messages=[
    ChatAgentMessage(
      id=str(uuid.uuid4()),  # Generate a unique ID for each message
      role="assistant",
      content="It's sunny in Paris."
    )
  ]
)
```
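Convert between formats

In predict, convert the incoming messages from list[ChatAgentMessage] into the input format your agent expects.

After your agent generates a response, convert its output into one or more ChatAgentMessage objects and wrap them in a ChatAgentResponse.

Tip

Convert LangChain output automatically

If you are wrapping a LangChain agent, you can use mlflow.langchain.output_parsers.ChatAgentOutputParser to automatically convert LangChain output into the MLflow ChatAgentMessage and ChatAgentResponse schemas.

The following is a simplified template for converting your agent: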
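
As an illustration of this conversion step, the sketch below turns incoming messages into the list of role/content dictionaries that many chat frameworks accept, and turns a plain-text reply back into the fields of an assistant message. `to_agent_input` and `to_assistant_message` are hypothetical helper names. The functions rely only on attribute access and plain dictionaries so the example runs without MLflow installed; in a real wrapper you would pass `ChatAgentMessage` objects in and build a `ChatAgentMessage` from the returned fields.

```python
import uuid
from types import SimpleNamespace
from typing import Any


def to_agent_input(messages: list[Any]) -> dict[str, list[dict[str, str]]]:
    """Build a {"messages": [...]} payload from objects exposing .role and .content."""
    return {
        "messages": [{"role": m.role, "content": m.content} for m in messages]
    }


def to_assistant_message(text: str) -> dict[str, str]:
    """Return the fields needed to build an assistant message with a unique ID."""
    return {"id": str(uuid.uuid4()), "role": "assistant", "content": text}


# Stand-in for a ChatAgentMessage (assumption: only .role and .content are read).
incoming = [
    SimpleNamespace(id="1", role="user", content="What's the weather in Paris?")
]

agent_input = to_agent_input(incoming)
reply = to_assistant_message("It's sunny in Paris.")
```

Whether your agent expects this dictionary shape is framework-specific, so treat the payload layout as an assumption to adapt.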

Python
from mlflow.pyfunc import ChatAgent
from mlflow.types.agent import ChatAgentMessage, ChatAgentResponse, ChatAgentChunk
import uuid


class MyWrappedAgent(ChatAgent):
  def __init__(self, agent):
    self.agent = agent

  def predict(self, messages, context=None, custom_inputs=None):
    # Convert messages to your agent's format
    agent_input = ... # build from messages
    agent_output = self.agent.invoke(agent_input)
    # Convert output to ChatAgentMessage
    return ChatAgentResponse(
      messages=[ChatAgentMessage(role="assistant", content=agent_output, id=str(uuid.uuid4()),)]
    )

  def predict_stream(self, messages, context=None, custom_inputs=None):
    # If your agent supports streaming
    for chunk in self.agent.stream(...):
      yield ChatAgentChunk(delta=ChatAgentMessage(role="assistant", content=chunk, id=str(uuid.uuid4())))

For complete examples, see the notebooks in the following section.

ChatAgent examples

The following notebooks show how to author streaming and non-streaming ChatAgents using the popular libraries OpenAI, LangGraph, and AutoGen.

LangGraph
OpenAI
AutoGen
DSPy

LangGraph tool-calling agent

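To learn how to extend the capabilities of these agents by adding tools, see AI agent tools.

Stream ChatAgent responses

Streaming agents deliver responses as a continuous stream of smaller, incremental chunks. Streaming reduces perceived latency and improves the user experience for conversational agents.

To author a streaming ChatAgent, define a predict_stream method that returns a generator yielding ChatAgentChunk objects, each containing a portion of the response. For details on ideal ChatAgent streaming behavior, see the MLflow docs.

The following code shows an example predict_stream function. For complete examples of streaming agents, see ChatAgent examples.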

Python
def predict_stream(
  self,
  messages: list[ChatAgentMessage],
  context: Optional[ChatContext] = None,
  custom_inputs: Optional[dict[str, Any]] = None,
) -> Generator[ChatAgentChunk, None, None]:
  # Convert messages to a format suitable for your agent
  request = {"messages": self._convert_messages_to_dict(messages)}

  # Stream the response from your agent
  for event in self.agent.stream(request, stream_mode="updates"):
    for node_data in event.values():
      # Yield each chunk of the response
      yield from (
        ChatAgentChunk(**{"delta": msg}) for msg in node_data["messages"]
      )
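
On the consuming side, a caller typically reassembles the streamed chunks into complete messages. The sketch below shows one way to do that, under the assumption that chunks belonging to the same message share a message ID. `collect_stream` and `fake_stream` are hypothetical names, and the chunks are modeled as plain dictionaries shaped like `{"delta": {...}}` rather than real `ChatAgentChunk` objects so that the example is self-contained.

```python
from typing import Iterable, Iterator


def fake_stream() -> Iterator[dict]:
    """Stand-in for predict_stream: yields chunk-like dicts (illustrative data)."""
    for piece in ["It's ", "sunny ", "in Paris."]:
        yield {"delta": {"id": "msg-1", "role": "assistant", "content": piece}}


def collect_stream(chunks: Iterable[dict]) -> list[dict]:
    """Concatenate chunk contents per message ID, preserving first-seen order."""
    merged: dict[str, dict] = {}
    for chunk in chunks:
        delta = chunk["delta"]
        if delta["id"] not in merged:
            # Start a new message with empty content, copying the other fields.
            merged[delta["id"]] = {**delta, "content": ""}
        merged[delta["id"]]["content"] += delta["content"] or ""
    return list(merged.values())


messages = collect_stream(fake_stream())
```

Grouping by ID keeps chunks from different messages apart, which matters when an agent emits more than one message in a single response.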

Author a legacy ChatModel agent

Important

Databricks recommends using the ChatAgent interface for authoring agents or generative AI apps. To migrate from ChatModel to ChatAgent, see MLflow documentation - Migrate from ChatModel to ChatAgent.

ChatModel is a legacy agent authoring interface in MLflow that extends OpenAI's ChatCompletion schema, so you can stay compatible with platforms that support the ChatCompletion standard while adding custom functionality. For details, see MLflow: Getting Started with ChatModel.

Authoring your agent as a subclass of mlflow.pyfunc.ChatModel provides the following benefits:

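- Enables streaming agent output when invoking a served agent, by passing {stream: true} in the request body.
- Automatically enables AI Gateway inference tables when your agent is served, providing access to enhanced request log metadata, such as the requester name.

Warning

Request logs and assessment logs are deprecated and will be removed in a future release. See request logs and assessment logs deprecation for migration guidance.

- Allows you to write agent code that is compatible with the ChatCompletion schema using typed Python classes.
- MLflow automatically infers a chat completion-compatible signature when logging the agent, even without an input_example. This simplifies the process of registering and deploying the agent. See Infer Model Signature during logging.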

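The following code is best run in a Databricks notebook. Notebooks provide a convenient environment for developing, testing, and iterating on your agent.

The MyAgent class extends mlflow.pyfunc.ChatModel and implements the required predict method. This ensures compatibility with Mosaic AI Agent Framework.

The class also includes the optional methods _create_chat_completion_chunk and predict_stream to handle streaming output.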

Python
# Install the latest version of mlflow
%pip install -U mlflow
dbutils.library.restartPython()

Python
import re
from typing import Optional, Dict, List, Generator
from mlflow.pyfunc import ChatModel
from mlflow.types.llm import (
  # Non-streaming helper classes
  ChatCompletionRequest,
  ChatCompletionResponse,
  ChatCompletionChunk,
  ChatMessage,
  ChatChoice,
  ChatParams,
  # Helper classes for streaming agent output
  ChatChoiceDelta,
  ChatChunkChoice,
)

class MyAgent(ChatModel):
  """
  Defines a custom agent that processes ChatCompletionRequests
  and returns ChatCompletionResponses.
  """
  def predict(self, context, messages: list[ChatMessage], params: ChatParams) -> ChatCompletionResponse:
    last_user_question_text = messages[-1].content
    response_message = ChatMessage(
      role="assistant",
      content=(
        f"I will always echo back your last question. Your last question was: {last_user_question_text}. "
      )
    )
    return ChatCompletionResponse(
      choices=[ChatChoice(message=response_message)]
    )

  def _create_chat_completion_chunk(self, content) -> ChatCompletionChunk:
    """Helper for constructing a ChatCompletionChunk instance for wrapping streaming agent output"""
    return ChatCompletionChunk(
      choices=[ChatChunkChoice(
        delta=ChatChoiceDelta(
          role="assistant",
          content=content
        )
      )]
    )

  def predict_stream(
    self, context, messages: List[ChatMessage], params: ChatParams
  ) -> Generator[ChatCompletionChunk, None, None]:
    last_user_question_text = messages[-1].content
    yield self._create_chat_completion_chunk("Echoing back your last question, word by word.")
    for word in re.findall(r"\S+\s*", last_user_question_text):
      yield self._create_chat_completion_chunk(word)

agent = MyAgent()
model_input = ChatCompletionRequest(
  messages=[ChatMessage(role="user", content="What is Databricks?")]
)
response = agent.predict(context=None, messages=model_input.messages, params=None)
print(response)
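
The word-by-word streaming in `predict_stream` depends on the pattern `\S+\s*`, which matches each run of non-whitespace characters together with any whitespace that follows it. The standalone sketch below (the `split_into_chunks` name is hypothetical) shows what that pattern produces, and that joining the pieces restores the original text when the text has no leading whitespace.

```python
import re


def split_into_chunks(text: str) -> list[str]:
    """Split text into words, keeping trailing whitespace attached to each word."""
    return re.findall(r"\S+\s*", text)


question = "What is Databricks?"
chunks = split_into_chunks(question)
print(chunks)  # ['What ', 'is ', 'Databricks?']

# Concatenating the chunks reproduces the input exactly.
assert "".join(chunks) == question
```

Keeping the whitespace attached to each word means a client can concatenate the streamed chunks directly without re-inserting spaces.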

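If you define the agent class MyAgent in one notebook, we recommend creating a separate driver notebook. The driver notebook logs the agent to Model Registry and deploys it using Model Serving.

This separation follows the workflow that Databricks recommends for logging models with MLflow's Models from Code methodology.

SplitChatMessagesRequest input schema (deprecated)

SplitChatMessagesRequest lets you pass the current query and the conversation history separately as agent input.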

Python
question = {
  "query": "What is MLflow",
  "history": [
    {
      "role": "user",
      "content": "What is Retrieval-augmented Generation?"
    },
    {
      "role": "assistant",
      "content": "RAG is"
    }
  ]
}
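
Agents that work with a single message list need to merge the two fields. The following is a minimal sketch, assuming the history is ordered oldest to newest and the query is the latest user turn. `split_request_to_messages` is a hypothetical helper, not an MLflow function.

```python
def split_request_to_messages(request: dict) -> list[dict]:
    """Append the current query to the history as the final user message."""
    # Copy the history so the caller's request is not modified.
    history = list(request.get("history") or [])
    return history + [{"role": "user", "content": request["query"]}]


question = {
    "query": "What is MLflow",
    "history": [
        {"role": "user", "content": "What is Retrieval-augmented Generation?"},
        {"role": "assistant", "content": "RAG is"},
    ],
}

messages = split_request_to_messages(question)
```

The result is a chat-style list whose last entry is the current question, which is the shape the ChatAgent and ChatModel interfaces on this page expect.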

StringResponse output schema (deprecated)

StringResponse lets you return the agent's response as an object with a single string content field.

{"content": "This is an example string response"}
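
A client that receives this payload only needs to read the `content` field. The following is a minimal sketch using the standard library; `extract_content` is a hypothetical helper, and the payload is the example shown above.

```python
import json


def extract_content(payload: str) -> str:
    """Parse a StringResponse-style JSON document and return its content string."""
    data = json.loads(payload)
    content = data["content"]
    if not isinstance(content, str):
        raise TypeError("expected 'content' to be a string")
    return content


raw = '{"content": "This is an example string response"}'
text = extract_content(raw)
```

The type check is a defensive choice for this sketch, so that a malformed response fails loudly instead of passing a non-string value downstream.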
