Query Unity AI Gateway endpoints

Note

Beta

This feature is in Beta. Account admins can control access to this feature from the Previews page of the account console. See Manage Databricks previews.

This page describes how to query Unity AI Gateway endpoints using the supported APIs.

Requirements

The Unity AI Gateway preview is enabled for your account. See Manage Databricks previews.
A Databricks workspace in a region where Unity AI Gateway is supported.
Unity Catalog is enabled for your workspace. See Enable a workspace for Unity Catalog.

Supported APIs and integrations

Unity AI Gateway supports the following APIs and integrations:

Unified APIs: OpenAI-compatible interfaces for querying models on Databricks. You can switch between models from different providers without changing how you query each model.
Native APIs: Provider-specific interfaces for accessing the latest models and provider-specific features.
Coding agents: Integrate coding agents with Unity AI Gateway to add centralized governance and monitoring to AI-assisted development workflows. See Integrate with coding agents.
Agents on Databricks Apps: Build and deploy AI agents on Databricks Apps that route LLM traffic through Unity AI Gateway. See Step 4. Govern LLM usage from agents on Databricks Apps with Unity AI Gateway.
ai_query: Use ai_query to query Databricks-provided Unity AI Gateway endpoints from SQL or Python for batch inference. See Query endpoints with ai_query.
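Each API family in the list above is reached through its own path prefix under the workspace URL. The following is a minimal sketch that collects the prefixes used by the examples on this page into one helper. The names GATEWAY_PATHS and gateway_base_url are hypothetical and are not part of any Databricks SDK.

```python
# Path prefixes copied from the examples on this page.
GATEWAY_PATHS = {
    "mlflow": "/ai-gateway/mlflow/v1",     # unified APIs (chat completions, embeddings, responses)
    "openai": "/ai-gateway/openai/v1",     # native OpenAI Responses API
    "anthropic": "/ai-gateway/anthropic",  # native Anthropic Messages API
    "gemini": "/ai-gateway/gemini",        # native Google Gemini API
}


def gateway_base_url(workspace_url: str, api: str) -> str:
    """Return the client base URL for one API family (hypothetical helper)."""
    if api not in GATEWAY_PATHS:
        raise ValueError(f"unknown API family: {api!r}")
    host = workspace_url.strip().rstrip("/")
    # Accept a bare host name as well as a full URL.
    if not host.startswith(("https://", "http://")):
        host = "https://" + host
    return host + GATEWAY_PATHS[api]
```

The returned string can be passed as base_url to the clients shown in the sections below.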

Query endpoints with `ai_query`

The ai_query function lets you query Databricks-provided Unity AI Gateway endpoints directly from SQL or Python. This gives you usage tracking for batch inference workloads.

Note

Unity AI Gateway support for ai_query is available only for Databricks-provided endpoints (for example, databricks-gpt-5-4 or databricks-claude-sonnet-4). Endpoints that you create in Unity AI Gateway are not currently supported.
Only usage tracking applies to ai_query batch inference workloads. Other Unity AI Gateway features, such as rate limits, guardrails, inference tables, and fallbacks, do not apply.

To get started:

Enable the Unity AI Gateway preview for your account. See Manage Databricks previews.
Use ai_query to query a Databricks-provided endpoint.

SQL
SELECT ai_query(
  'databricks-gpt-5-4',
  'Summarize the following text: ' || text_column
) AS summary
FROM my_table
LIMIT 10

Requests made to Databricks-provided endpoints through ai_query are recorded in the usage tracking system table (system.ai_gateway.usage). These requests also appear in the built-in usage dashboard.

For the full ai_query syntax and parameter reference, see ai_query function. For best practices and supported models, see Use ai_query.
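This page shows ai_query only from SQL. As a rough sketch of the Python route, the helper below builds the same SQL expression as a string so that it can be handed to a DataFrame. The helper name ai_query_expr is hypothetical, and the sketch deliberately rejects quotes and backslashes rather than guessing at escaping rules.

```python
def ai_query_expr(endpoint: str, prompt_prefix: str, column: str) -> str:
    """Build a SQL expression that calls ai_query on one column (hypothetical helper)."""
    for literal in (endpoint, prompt_prefix):
        # Keep the sketch simple: refuse characters that would need escaping.
        if "'" in literal or "\\" in literal:
            raise ValueError(f"unsupported character in literal: {literal!r}")
    # Use Python's identifier check as a conservative proxy for a plain column name.
    if not column.isidentifier():
        raise ValueError(f"not a simple column name: {column!r}")
    return f"ai_query('{endpoint}', '{prompt_prefix}' || {column})"


expr = ai_query_expr("databricks-gpt-5-4", "Summarize the following text: ", "text_column")
```

In a Databricks notebook, where a spark session is assumed to be predefined, the expression could be applied with spark.table("my_table").limit(10).selectExpr(expr + " AS summary"), which mirrors the SQL example above.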

Query endpoints with unified APIs

Unified APIs provide OpenAI-compatible interfaces for querying models on Databricks. With unified APIs, you can switch between models from different providers without changing your code.
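To illustrate what switching models without code changes can look like, here is a small sketch in which the endpoint name is the only thing that varies between calls. The function names ask and compare are hypothetical. The client argument is assumed to be an OpenAI client configured with the /ai-gateway/mlflow/v1 base URL, as in the examples that follow.

```python
from typing import Any, Dict, List


def ask(client: Any, endpoint: str, prompt: str, max_tokens: int = 256) -> str:
    """Send one user message to the given endpoint and return the reply text."""
    completion = client.chat.completions.create(
        model=endpoint,  # the endpoint name is the only provider-specific value
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
    )
    return completion.choices[0].message.content


def compare(client: Any, endpoints: List[str], prompt: str) -> Dict[str, str]:
    """Ask the same question of several endpoints and collect the replies by name."""
    return {name: ask(client, name, prompt) for name in endpoints}
```

Calling compare(client, ["<endpoint-a>", "<endpoint-b>"], "What is Databricks?") would return one reply per endpoint. The two endpoint names here are placeholders.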

MLflow Chat Completions API

Python
from openai import OpenAI
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
  api_key=DATABRICKS_TOKEN,
  base_url="https://<workspace-url>/ai-gateway/mlflow/v1"
)

chat_completion = client.chat.completions.create(
  messages=[
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hello! How can I assist you today?"},
    {"role": "user", "content": "What is Databricks?"},
  ],
  model="<ai-gateway-endpoint>",
  max_tokens=256
)

print(chat_completion.choices[0].message.content)

Bash
curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<ai-gateway-endpoint>",
    "max_tokens": 256,
    "messages": [
      {"role": "user", "content": "Hello!"},
      {"role": "assistant", "content": "Hello! How can I assist you today?"},
      {"role": "user", "content": "What is Databricks?"}
    ]
  }' \
  https://<workspace-url>/ai-gateway/mlflow/v1/chat/completions

Replace <workspace-url> with your Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.
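For readers who prefer to issue the REST call from Python without an SDK, the sketch below builds the same POST request as the curl command using only the standard library. The function name build_chat_request is hypothetical. The request is constructed but not sent, so the sketch runs without network access.

```python
import base64
import json
import urllib.request


def build_chat_request(workspace_url, token, endpoint, messages, max_tokens=256):
    """Build (but do not send) the POST request that the curl example issues."""
    body = json.dumps(
        {"model": endpoint, "max_tokens": max_tokens, "messages": messages}
    ).encode("utf-8")
    # curl's `-u token:<value>` sends HTTP Basic auth with the user name "token".
    basic = base64.b64encode(f"token:{token}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"https://{workspace_url}/ai-gateway/mlflow/v1/chat/completions",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {basic}",
        },
    )
```

Passing the returned object to urllib.request.urlopen sends the request. The JSON response is assumed to follow the same shape that the Python client example reads, with the reply text under choices[0].message.content.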

MLflow Embeddings API

Python
from openai import OpenAI
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
  api_key=DATABRICKS_TOKEN,
  base_url="https://<workspace-url>/ai-gateway/mlflow/v1"
)

embeddings = client.embeddings.create(
  input="What is Databricks?",
  model="<ai-gateway-endpoint>"
)

print(embeddings.data[0].embedding)

Bash
curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<ai-gateway-endpoint>",
    "input": "What is Databricks?"
  }' \
  https://<workspace-url>/ai-gateway/mlflow/v1/embeddings

Replace <workspace-url> with your Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.
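The embedding returned above is a plain list of floats. One common way to use such vectors, shown here only as an illustrative sketch and not as something this page prescribes, is to compare two of them with cosine similarity. The function name cosine_similarity is hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Given two responses from client.embeddings.create, the vectors would be read from data[0].embedding on each response and passed to this function.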

Supervisor API

The Supervisor API (/mlflow/v1/responses) is a provider-agnostic, OpenResponses-compatible API for building agents, and is in Beta. Account admins can enable access from the Previews page. See Manage Databricks previews. You can choose the best model for your agent use case from any provider without changing your code.

Python
from openai import OpenAI
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
  api_key=DATABRICKS_TOKEN,
  base_url="https://<workspace-url>/ai-gateway/mlflow/v1"
)

response = client.responses.create(
  model="<ai-gateway-endpoint>",
  input=[{"role": "user", "content": "What is Databricks?"}]
)

print(response.output_text)

Bash
curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<ai-gateway-endpoint>",
    "input": [
      {"role": "user", "content": "What is Databricks?"}
    ]
  }' \
  https://<workspace-url>/ai-gateway/mlflow/v1/responses

Replace <workspace-url> with your Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.

Query endpoints with native APIs

Native APIs provide provider-specific interfaces for querying models on Databricks. Use native APIs to access the latest provider-specific features.

OpenAI Responses API

Python
from openai import OpenAI
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
  api_key=DATABRICKS_TOKEN,
  base_url="https://<workspace-url>/ai-gateway/openai/v1"
)

response = client.responses.create(
  model="<ai-gateway-endpoint>",
  max_output_tokens=256,
  input=[
    {
      "role": "user",
      "content": [{"type": "input_text", "text": "Hello!"}]
    },
    {
      "role": "assistant",
      "content": [{"type": "output_text", "text": "Hello! How can I assist you today?"}]
    },
    {
      "role": "user",
      "content": [{"type": "input_text", "text": "What is Databricks?"}]
    }
  ]
)

print(response.output)

Bash
curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<ai-gateway-endpoint>",
    "max_output_tokens": 256,
    "input": [
      {
        "role": "user",
        "content": [{"type": "input_text", "text": "Hello!"}]
      },
      {
        "role": "assistant",
        "content": [{"type": "output_text", "text": "Hello! How can I assist you today?"}]
      },
      {
        "role": "user",
        "content": [{"type": "input_text", "text": "What is Databricks?"}]
      }
    ]
  }' \
  https://<workspace-url>/ai-gateway/openai/v1/responses

Replace <workspace-url> with your Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.
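The Responses examples above wrap each message in typed content parts: input_text for user turns and output_text for assistant turns. The sketch below converts a simple chat-style message list into that shape. The function name to_responses_input is hypothetical, and only the two roles shown on this page are handled.

```python
from typing import Dict, List

# Content-part type per role, as used in the examples on this page.
PART_TYPE_BY_ROLE = {"user": "input_text", "assistant": "output_text"}


def to_responses_input(messages: List[Dict[str, str]]) -> List[dict]:
    """Convert [{"role": ..., "content": "<text>"}] into Responses-style input items."""
    converted = []
    for message in messages:
        role = message["role"]
        if role not in PART_TYPE_BY_ROLE:
            raise ValueError(f"unsupported role in this sketch: {role!r}")
        converted.append(
            {
                "role": role,
                "content": [{"type": PART_TYPE_BY_ROLE[role], "text": message["content"]}],
            }
        )
    return converted
```

The result can be passed as the input argument of client.responses.create.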

Anthropic Messages API

Python
import anthropic
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = anthropic.Anthropic(
  api_key="unused",
  base_url="https://<workspace-url>/ai-gateway/anthropic",
  default_headers={
    "Authorization": f"Bearer {DATABRICKS_TOKEN}",
  },
)

message = client.messages.create(
  model="<ai-gateway-endpoint>",
  max_tokens=256,
  messages=[
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hello! How can I assist you today?"},
    {"role": "user", "content": "What is Databricks?"},
  ],
)

print(message.content[0].text)

Bash
curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<ai-gateway-endpoint>",
    "max_tokens": 256,
    "messages": [
      {"role": "user", "content": "Hello!"},
      {"role": "assistant", "content": "Hello! How can I assist you today?"},
      {"role": "user", "content": "What is Databricks?"}
    ]
  }' \
  https://<workspace-url>/ai-gateway/anthropic/v1/messages

Replace <workspace-url> with your Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.

Google Gemini API

Python
from google import genai
from google.genai import types
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = genai.Client(
  api_key="databricks",
  http_options=types.HttpOptions(
    base_url="https://<workspace-url>/ai-gateway/gemini",
    headers={
      "Authorization": f"Bearer {DATABRICKS_TOKEN}",
    },
  ),
)

response = client.models.generate_content(
  model="<ai-gateway-endpoint>",
  contents=[
    types.Content(
      role="user",
      parts=[types.Part(text="Hello!")],
    ),
    types.Content(
      role="model",
      parts=[types.Part(text="Hello! How can I assist you today?")],
    ),
    types.Content(
      role="user",
      parts=[types.Part(text="What is Databricks?")],
    ),
  ],
  config=types.GenerateContentConfig(
    max_output_tokens=256,
  ),
)

print(response.text)

Bash
curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "Hello!"}]
      },
      {
        "role": "model",
        "parts": [{"text": "Hello! How can I assist you today?"}]
      },
      {
        "role": "user",
        "parts": [{"text": "What is Databricks?"}]
      }
    ],
    "generationConfig": {
      "maxOutputTokens": 256
    }
  }' \
  https://<workspace-url>/ai-gateway/gemini/v1beta/models/<ai-gateway-endpoint>:generateContent

Replace <workspace-url> with your Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.
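The Gemini request differs from the other APIs in two ways that are easy to miss: the assistant role is named model, and the endpoint name goes in the URL path rather than in the body. The sketch below shows both, using hypothetical helper names (to_gemini_contents and generate_content_url) and the REST shapes from the curl example above.

```python
from typing import Dict, List

# Chat-style role names mapped to the role names in the Gemini request body.
GEMINI_ROLE = {"user": "user", "assistant": "model"}


def to_gemini_contents(messages: List[Dict[str, str]]) -> List[dict]:
    """Convert [{"role": ..., "content": "<text>"}] into the "contents" list."""
    contents = []
    for message in messages:
        role = message["role"]
        if role not in GEMINI_ROLE:
            raise ValueError(f"unsupported role in this sketch: {role!r}")
        contents.append({"role": GEMINI_ROLE[role], "parts": [{"text": message["content"]}]})
    return contents


def generate_content_url(workspace_url: str, endpoint: str) -> str:
    """Build the generateContent URL; the endpoint name is part of the path."""
    return f"https://{workspace_url}/ai-gateway/gemini/v1beta/models/{endpoint}:generateContent"
```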

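Tag requests for usage tracking

Use the Databricks-Ai-Gateway-Request-Tags HTTP header to attach custom key-value tags to individual requests. Request tags are recorded in the request_tags column of both the usage tracking system table and inference tables, so you can track costs, attribute usage, and filter analyses by project, team, environment, or any other dimension.

The header value must be a JSON object that maps string keys to string values. For example: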

JSON
{ "project": "chatbot", "team": "ml-platform", "environment": "production" }
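Because the header value must map string keys to string values, it can help to validate tags before serializing them. The following is a minimal sketch. The function name request_tags_header is hypothetical, and the check covers only the string-to-string rule stated above.

```python
import json
from typing import Dict

HEADER_NAME = "Databricks-Ai-Gateway-Request-Tags"


def request_tags_header(tags: Dict[str, str]) -> Dict[str, str]:
    """Validate the tags and return a one-entry header dict (hypothetical helper)."""
    if not isinstance(tags, dict):
        raise TypeError("tags must be a dict")
    for key, value in tags.items():
        # Non-string values such as 3 or True would not satisfy the stated format.
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError(f"tag {key!r} must map a string key to a string value")
    return {HEADER_NAME: json.dumps(tags)}
```

The returned dict has the shape expected by the extra_headers and default_headers arguments used in the examples in this section.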

To attach tags to a request, use the extra_headers parameter (Python) or pass the header directly (REST API).

Python (OpenAI SDK)
from openai import OpenAI
import json
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
  api_key=DATABRICKS_TOKEN,
  base_url="https://<workspace-url>/ai-gateway/mlflow/v1"
)

request_tags = {"project": "chatbot", "team": "ml-platform"}

chat_completion = client.chat.completions.create(
  messages=[
    {"role": "user", "content": "What is Databricks?"},
  ],
  model="<ai-gateway-endpoint>",
  max_tokens=256,
  extra_headers={
    "Databricks-Ai-Gateway-Request-Tags": json.dumps(request_tags)
  }
)

Python (Anthropic SDK)
import anthropic
import json
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

request_tags = {"project": "chatbot", "team": "ml-platform"}

client = anthropic.Anthropic(
  api_key="unused",
  base_url="https://<workspace-url>/ai-gateway/anthropic",
  default_headers={
    "Authorization": f"Bearer {DATABRICKS_TOKEN}",
    "Databricks-Ai-Gateway-Request-Tags": json.dumps(request_tags),
  },
)

message = client.messages.create(
  model="<ai-gateway-endpoint>",
  max_tokens=256,
  messages=[
    {"role": "user", "content": "What is Databricks?"},
  ],
)

Bash
curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -H 'Databricks-Ai-Gateway-Request-Tags: {"project": "chatbot", "team": "ml-platform"}' \
  -d '{
    "model": "<ai-gateway-endpoint>",
    "max_tokens": 256,
    "messages": [
      {"role": "user", "content": "What is Databricks?"}
    ]
  }' \
  https://<workspace-url>/ai-gateway/mlflow/v1/chat/completions

Replace <workspace-url> with your Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.

Next steps

Unity AI Gateway for agents and LLMs
Configure Unity AI Gateway endpoints
Integrate with coding agents
Supervisor API (Beta): Run multi-turn agent workflows with hosted tools using /mlflow/v1/responses
Step 4. Govern LLM usage from agents on Databricks Apps with Unity AI Gateway: Route LLM calls from agents on Databricks Apps through Unity AI Gateway
Monitor usage for Unity AI Gateway endpoints
Monitor models using inference tables
Configure rate limits for Unity AI Gateway endpoints
