Query Unity AI Gateway endpoints

Note

Beta

This feature is in Beta. Account admins can control access to this feature from the Previews page of the account console. See Manage Databricks previews.

This page describes how to query Unity AI Gateway endpoints using the supported APIs.

Requirements

Unity AI Gateway preview enabled for your account. See Manage Databricks previews.
A Databricks workspace in a region where Unity AI Gateway is supported.
Unity Catalog enabled for your workspace. See Enable a workspace for Unity Catalog.

Supported APIs and integrations

Unity AI Gateway supports the following APIs and integrations:

Unified APIs: OpenAI-compatible interfaces for querying models on Databricks. You can switch between models from different providers without changing how you query each model.
Native APIs: Provider-specific interfaces for accessing the latest models and provider-specific features.
Coding agents: Integrate coding agents with Unity AI Gateway to add centralized governance and monitoring to AI-assisted development workflows. See Integrate with coding agents.
Agents on Databricks Apps: Create and deploy AI agents on Databricks Apps that route LLM traffic through Unity AI Gateway. See Step 4. Use Unity AI Gateway to govern LLM usage from agents on Databricks Apps.
ai_query: Use ai_query to query Databricks-provided Unity AI Gateway endpoints from SQL or Python for batch inference. See Query endpoints with ai_query.

Query endpoints with `ai_query`

The ai_query function lets you query Databricks-provided Unity AI Gateway endpoints directly from SQL or Python. This gives you usage tracking for batch inference workloads.

Note

Unity AI Gateway support for ai_query is available only for Databricks-provided endpoints (for example, databricks-gpt-5-4 or databricks-claude-sonnet-4). Endpoints that you create in Unity AI Gateway are not currently supported.
Only usage tracking applies to ai_query batch inference workloads. Other Unity AI Gateway features, such as rate limits, guardrails, inference tables, and fallbacks, do not apply.

To get started:

Enable the Unity AI Gateway preview for your account. See Manage Databricks previews.
Use ai_query to query a Databricks-provided endpoint.

SQL
SELECT ai_query(
  'databricks-gpt-5-4',
  'Summarize the following text: ' || text_column
) AS summary
FROM my_table
LIMIT 10

Requests made to Databricks-provided endpoints through ai_query are logged in the usage tracking system table (system.ai_gateway.usage). These requests also appear in the built-in usage dashboard.
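
From Python, one way to run the same kind of statement is to assemble the SQL text and pass it to spark.sql in a Databricks notebook. The sketch below only builds the string. The helper name build_ai_query_sql and its parameters are hypothetical, and my_table and text_column are the placeholder names from the SQL example above.

```python
def build_ai_query_sql(endpoint: str, table: str, column: str,
                       prompt: str, limit: int = 10) -> str:
    """Return a SELECT statement that calls ai_query on one column.

    Hypothetical helper for illustration. Identifiers are inserted as-is,
    so only pass trusted table and column names.
    """
    # This sketch does not implement SQL escaping, so reject risky characters.
    for value in (endpoint, prompt):
        if "'" in value or "\\" in value:
            raise ValueError("quotes and backslashes are not handled in this sketch")
    return (
        "SELECT ai_query(\n"
        f"  '{endpoint}',\n"
        f"  '{prompt}' || {column}\n"
        ") AS summary\n"
        f"FROM {table}\n"
        f"LIMIT {int(limit)}"
    )


statement = build_ai_query_sql(
    endpoint="databricks-gpt-5-4",
    table="my_table",
    column="text_column",
    prompt="Summarize the following text: ",
)
print(statement)

# In a Databricks notebook, where a `spark` session is available (assumed):
# summaries = spark.sql(statement)
```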

For the full ai_query syntax and parameter reference, see ai_query function. For best practices and supported models, see Use ai_query.

Query endpoints with unified APIs

Unified APIs provide an OpenAI-compatible interface for querying models on Databricks. With unified APIs, you can switch between models from different providers without changing your code.
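
To illustrate the switching claim, here is a minimal standard-library sketch that builds (but does not send) two chat completions requests. The workspace host and the two endpoint names are hypothetical, and the helper build_chat_request is not part of any SDK. Only the model field differs between the requests.

```python
import json
import urllib.request


def build_chat_request(workspace_url: str, endpoint: str, token: str,
                       user_message: str) -> urllib.request.Request:
    """Build (but do not send) a chat completions request."""
    body = {
        "model": endpoint,  # the only field that changes between models
        "max_tokens": 256,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"https://{workspace_url}/ai-gateway/mlflow/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Hypothetical workspace host and endpoint names, for illustration only.
requests_by_endpoint = {
    name: build_chat_request("example.cloud.databricks.com", name,
                             "<token>", "What is Databricks?")
    for name in ("my-openai-endpoint", "my-anthropic-endpoint")
}

for name, request in requests_by_endpoint.items():
    payload = json.loads(request.data)
    print(name, "->", request.full_url, payload["model"])

# To send a request: urllib.request.urlopen(request)
```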

MLflow Chat Completions API

Python
from openai import OpenAI
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
  api_key=DATABRICKS_TOKEN,
  base_url="https://<workspace-url>/ai-gateway/mlflow/v1"
)

chat_completion = client.chat.completions.create(
  messages=[
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hello! How can I assist you today?"},
    {"role": "user", "content": "What is Databricks?"},
  ],
  model="<ai-gateway-endpoint>",
  max_tokens=256
)

print(chat_completion.choices[0].message.content)

Bash
curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<ai-gateway-endpoint>",
    "max_tokens": 256,
    "messages": [
      {"role": "user", "content": "Hello!"},
      {"role": "assistant", "content": "Hello! How can I assist you today?"},
      {"role": "user", "content": "What is Databricks?"}
    ]
  }' \
  https://<workspace-url>/ai-gateway/mlflow/v1/chat/completions

Replace <workspace-url> with your Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.

MLflow Embeddings API

Python
from openai import OpenAI
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
  api_key=DATABRICKS_TOKEN,
  base_url="https://<workspace-url>/ai-gateway/mlflow/v1"
)

embeddings = client.embeddings.create(
  input="What is Databricks?",
  model="<ai-gateway-endpoint>"
)

print(embeddings.data[0].embedding)

Bash
curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<ai-gateway-endpoint>",
    "input": "What is Databricks?"
  }' \
  https://<workspace-url>/ai-gateway/mlflow/v1/embeddings

Replace <workspace-url> with your Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.

Supervisor API

Query endpoints with native APIs

Native APIs provide provider-specific interfaces for querying models on Databricks. Use native APIs to access the latest provider-specific features.
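
The examples on this page use one base path per API family. As a rough sketch, the mapping can be collected in one place. The path prefixes are taken from the examples on this page, while the helper name gateway_base_url, the short keys, and the example host are hypothetical.

```python
# Path prefixes as they appear in the examples on this page.
GATEWAY_BASE_PATHS = {
    "mlflow": "/ai-gateway/mlflow/v1",     # chat completions and embeddings
    "openai": "/ai-gateway/openai/v1",     # OpenAI Responses API
    "anthropic": "/ai-gateway/anthropic",  # Anthropic Messages API
    "gemini": "/ai-gateway/gemini",        # Google Gemini API
}


def gateway_base_url(workspace_url: str, api: str) -> str:
    """Return the base URL to pass to a client for the given API family."""
    if api not in GATEWAY_BASE_PATHS:
        raise ValueError(f"unknown API family: {api!r}")
    host = workspace_url.strip()
    # Accept either a bare host or a full https:// URL.
    if host.startswith("https://"):
        host = host[len("https://"):]
    host = host.rstrip("/")
    return f"https://{host}{GATEWAY_BASE_PATHS[api]}"


# Hypothetical workspace host, for illustration only.
for family in GATEWAY_BASE_PATHS:
    print(family, gateway_base_url("example.cloud.databricks.com", family))
```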

OpenAI Responses API

Python
from openai import OpenAI
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
  api_key=DATABRICKS_TOKEN,
  base_url="https://<workspace-url>/ai-gateway/openai/v1"
)

response = client.responses.create(
  model="<ai-gateway-endpoint>",
  max_output_tokens=256,
  input=[
    {
      "role": "user",
      "content": [{"type": "input_text", "text": "Hello!"}]
    },
    {
      "role": "assistant",
      "content": [{"type": "output_text", "text": "Hello! How can I assist you today?"}]
    },
    {
      "role": "user",
      "content": [{"type": "input_text", "text": "What is Databricks?"}]
    }
  ]
)

print(response.output)
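
Here response.output is a list of output items rather than a single string. The sketch below pulls the assistant text out of a JSON response body, assuming the gateway returns the standard OpenAI Responses shape (message items that contain output_text parts). The helper name extract_output_text and the sample payload are illustrative, not captured output.

```python
def extract_output_text(response_json: dict) -> str:
    """Concatenate the text of all output_text parts in message items."""
    texts = []
    for item in response_json.get("output", []):
        # Skip non-message items (for example, reasoning or tool calls).
        if item.get("type") != "message":
            continue
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                texts.append(part.get("text", ""))
    return "".join(texts)


# Hand-written sample in the assumed response shape.
sample = {
    "output": [
        {"type": "reasoning", "summary": []},
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {"type": "output_text", "text": "Databricks is a data and AI platform."}
            ],
        },
    ]
}

print(extract_output_text(sample))
```

Recent versions of the OpenAI Python SDK also expose response.output_text, which serves a similar purpose when you use the SDK instead of raw JSON.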

Bash
curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<ai-gateway-endpoint>",
    "max_output_tokens": 256,
    "input": [
      {
        "role": "user",
        "content": [{"type": "input_text", "text": "Hello!"}]
      },
      {
        "role": "assistant",
        "content": [{"type": "output_text", "text": "Hello! How can I assist you today?"}]
      },
      {
        "role": "user",
        "content": [{"type": "input_text", "text": "What is Databricks?"}]
      }
    ]
  }' \
  https://<workspace-url>/ai-gateway/openai/v1/responses

Replace <workspace-url> with your Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.

Anthropic Messages API

Python
import anthropic
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = anthropic.Anthropic(
  api_key="unused",
  base_url="https://<workspace-url>/ai-gateway/anthropic",
  default_headers={
    "Authorization": f"Bearer {DATABRICKS_TOKEN}",
  },
)

message = client.messages.create(
  model="<ai-gateway-endpoint>",
  max_tokens=256,
  messages=[
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hello! How can I assist you today?"},
    {"role": "user", "content": "What is Databricks?"},
  ],
)

print(message.content[0].text)

Bash
curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<ai-gateway-endpoint>",
    "max_tokens": 256,
    "messages": [
      {"role": "user", "content": "Hello!"},
      {"role": "assistant", "content": "Hello! How can I assist you today?"},
      {"role": "user", "content": "What is Databricks?"}
    ]
  }' \
  https://<workspace-url>/ai-gateway/anthropic/v1/messages

Replace <workspace-url> with your Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.

Google Gemini API

Python
from google import genai
from google.genai import types
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = genai.Client(
  api_key="databricks",
  http_options=types.HttpOptions(
    base_url="https://<workspace-url>/ai-gateway/gemini",
    headers={
      "Authorization": f"Bearer {DATABRICKS_TOKEN}",
    },
  ),
)

response = client.models.generate_content(
  model="<ai-gateway-endpoint>",
  contents=[
    types.Content(
      role="user",
      parts=[types.Part(text="Hello!")],
    ),
    types.Content(
      role="model",
      parts=[types.Part(text="Hello! How can I assist you today?")],
    ),
    types.Content(
      role="user",
      parts=[types.Part(text="What is Databricks?")],
    ),
  ],
  config=types.GenerateContentConfig(
    max_output_tokens=256,
  ),
)

print(response.text)

Bash
curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "Hello!"}]
      },
      {
        "role": "model",
        "parts": [{"text": "Hello! How can I assist you today?"}]
      },
      {
        "role": "user",
        "parts": [{"text": "What is Databricks?"}]
      }
    ],
    "generationConfig": {
      "maxOutputTokens": 256
    }
  }' \
  https://<workspace-url>/ai-gateway/gemini/v1beta/models/<ai-gateway-endpoint>:generateContent

Replace <workspace-url> with your Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.

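Tag requests for usage tracking

Use the Databricks-Ai-Gateway-Request-Tags HTTP header to attach custom key-value tags to individual requests. Request tags are recorded in the request_tags column of both the usage tracking system table and inference tables, so you can track costs, attribute usage, and filter analyses by project, team, environment, or any other dimension.

The header value must be a JSON object that maps string keys to string values. For example: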

JSON
{ "project": "chatbot", "team": "ml-platform", "environment": "production" }
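
Because the header value must map string keys to string values, it can help to validate tags before serializing them. The following is a minimal sketch. The helper name encode_request_tags is hypothetical, and the gateway may enforce further rules that this check does not cover.

```python
import json


def encode_request_tags(tags: dict) -> str:
    """Validate that tags map strings to strings, then serialize to JSON."""
    if not isinstance(tags, dict):
        raise TypeError("tags must be a dict")
    for key, value in tags.items():
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError(f"tag {key!r} must map a string to a string")
    return json.dumps(tags)


header_value = encode_request_tags(
    {"project": "chatbot", "team": "ml-platform", "environment": "production"}
)

# Header dict that can be passed to an HTTP client or SDK call.
headers = {"Databricks-Ai-Gateway-Request-Tags": header_value}
print(headers)
```

Non-string values such as numbers raise a TypeError here instead of being sent, so convert them with str() first if you need them as tags.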

Attach tags to a request by using the extra_headers parameter (Python) or by passing the header directly (REST API).

Python (OpenAI SDK)
from openai import OpenAI
import json
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
  api_key=DATABRICKS_TOKEN,
  base_url="https://<workspace-url>/ai-gateway/mlflow/v1"
)

request_tags = {"project": "chatbot", "team": "ml-platform"}

chat_completion = client.chat.completions.create(
  messages=[
    {"role": "user", "content": "What is Databricks?"},
  ],
  model="<ai-gateway-endpoint>",
  max_tokens=256,
  extra_headers={
    "Databricks-Ai-Gateway-Request-Tags": json.dumps(request_tags)
  }
)

Python (Anthropic SDK)
import anthropic
import json
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

request_tags = {"project": "chatbot", "team": "ml-platform"}

client = anthropic.Anthropic(
  api_key="unused",
  base_url="https://<workspace-url>/ai-gateway/anthropic",
  default_headers={
    "Authorization": f"Bearer {DATABRICKS_TOKEN}",
    "Databricks-Ai-Gateway-Request-Tags": json.dumps(request_tags),
  },
)

message = client.messages.create(
  model="<ai-gateway-endpoint>",
  max_tokens=256,
  messages=[
    {"role": "user", "content": "What is Databricks?"},
  ],
)

Bash
curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -H 'Databricks-Ai-Gateway-Request-Tags: {"project": "chatbot", "team": "ml-platform"}' \
  -d '{
    "model": "<ai-gateway-endpoint>",
    "max_tokens": 256,
    "messages": [
      {"role": "user", "content": "What is Databricks?"}
    ]
  }' \
  https://<workspace-url>/ai-gateway/mlflow/v1/chat/completions

Replace <workspace-url> with your Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.

Next steps

Unity AI Gateway for agents and LLMs
Configure Unity AI Gateway endpoints
Integrate with coding agents
Step 4. Use Unity AI Gateway to govern LLM usage from agents on Databricks Apps: route LLM calls from agents on Databricks Apps through Unity AI Gateway
Monitor usage of Unity AI Gateway endpoints
Monitor models using inference tables
Configure rate limits for Unity AI Gateway endpoints
