
Query Unity AI Gateway endpoints

Beta

This feature is in Beta. Account admins can control access to this feature from the account console Previews page. See Manage Databricks previews.

This page describes how to query Unity AI Gateway endpoints using supported APIs.

Requirements

Supported APIs and integrations

Unity AI Gateway supports the following APIs and integrations:

  • Unified APIs: OpenAI-compatible interfaces to query models on Databricks. Seamlessly switch between models from different providers without changing how you query each model.
  • Native APIs: Provider-specific interfaces to access the latest model and provider-specific features.
  • Coding agents: Integrate your coding agents with Unity AI Gateway to add centralized governance and monitoring to your AI-assisted development workflows. See Integrate with coding agents.
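Each API surface above is reached through a different path suffix on the same gateway host. As a quick reference, the suffixes used by the examples on this page can be collected in a small helper (the host name here is a placeholder; only the path suffixes come from this page):

```python
# Path suffixes for each Unity AI Gateway API surface, as used in the
# examples on this page. The host portion is a placeholder you must supply.
API_PATHS = {
    "unified": "mlflow/v1",    # MLflow Chat Completions, Embeddings, Supervisor
    "openai": "openai/v1",     # OpenAI Responses API
    "anthropic": "anthropic",  # Anthropic Messages API
    "gemini": "gemini",        # Google Gemini API
}

def gateway_base_url(host: str, api: str) -> str:
    """Build the base_url string to pass to the matching provider SDK client."""
    return f"https://{host}/{API_PATHS[api]}"

print(gateway_base_url("my-gateway.example.com", "unified"))
# → https://my-gateway.example.com/mlflow/v1
```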

Query endpoints with unified APIs

Unified APIs offer an OpenAI-compatible interface to query models on Databricks. Use unified APIs to seamlessly switch between models from different providers without changing your code.

MLflow Chat Completions API

Python
from openai import OpenAI
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
    api_key=DATABRICKS_TOKEN,
    base_url="https://<ai-gateway-url>/mlflow/v1"
)

chat_completion = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hello! How can I assist you today?"},
        {"role": "user", "content": "What is Databricks?"},
    ],
    model="<ai-gateway-endpoint>",
    max_tokens=256
)

print(chat_completion.choices[0].message.content)

Replace <ai-gateway-url> with your Unity AI Gateway URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.

MLflow Embeddings API

Python
from openai import OpenAI
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
    api_key=DATABRICKS_TOKEN,
    base_url="https://<ai-gateway-url>/mlflow/v1"
)

embeddings = client.embeddings.create(
    input="What is Databricks?",
    model="<ai-gateway-endpoint>"
)

print(embeddings.data[0].embedding)

Replace <ai-gateway-url> with your Unity AI Gateway URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.
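Embedding vectors are commonly compared with cosine similarity, for example to rank documents against a query vector. A minimal pure-Python sketch, independent of any SDK:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors: 1.0 means same
    direction, 0.0 means orthogonal (unrelated)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical direction → 1.0
```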

Supervisor API

The Supervisor API (/mlflow/v1/responses) is an OpenResponses-compatible, provider-agnostic API for building agents. Use it to pick the best model for your agent use case across providers without changing your code. This API is in Beta; account admins can enable access from the account console Previews page. See Manage Databricks previews.

Python
from openai import OpenAI
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
    api_key=DATABRICKS_TOKEN,
    base_url="https://<ai-gateway-url>/mlflow/v1"
)

response = client.responses.create(
    model="<ai-gateway-endpoint>",
    input=[{"role": "user", "content": "What is Databricks?"}]
)

print(response.output_text)

Replace <ai-gateway-url> with your Unity AI Gateway URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.

Query endpoints with native APIs

Native APIs offer provider-specific interfaces to query models on Databricks. Use native APIs to access the latest provider-specific features.

OpenAI Responses API

Python
from openai import OpenAI
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
    api_key=DATABRICKS_TOKEN,
    base_url="https://<ai-gateway-url>/openai/v1"
)

response = client.responses.create(
    model="<ai-gateway-endpoint>",
    max_output_tokens=256,
    input=[
        {
            "role": "user",
            "content": [{"type": "input_text", "text": "Hello!"}]
        },
        {
            "role": "assistant",
            "content": [{"type": "output_text", "text": "Hello! How can I assist you today?"}]
        },
        {
            "role": "user",
            "content": [{"type": "input_text", "text": "What is Databricks?"}]
        }
    ]
)

print(response.output)

Replace <ai-gateway-url> with your Unity AI Gateway URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.
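Note the typed content pattern in the example above: user turns carry input_text parts and assistant turns carry output_text parts. A hypothetical helper (not part of any SDK) can convert simple {"role", "content"} chat messages into that shape:

```python
# Hypothetical converter: simple chat-style messages -> typed Responses input
# items. Assumes only "user" and "assistant" roles, as in the example above.
def to_responses_input(messages):
    items = []
    for m in messages:
        # Assistant turns use "output_text" parts; everything else "input_text".
        part_type = "output_text" if m["role"] == "assistant" else "input_text"
        items.append({
            "role": m["role"],
            "content": [{"type": part_type, "text": m["content"]}],
        })
    return items
```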

Anthropic Messages API

Python
import anthropic
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = anthropic.Anthropic(
    api_key="unused",
    base_url="https://<ai-gateway-url>/anthropic",
    default_headers={
        "Authorization": f"Bearer {DATABRICKS_TOKEN}",
    },
)

message = client.messages.create(
    model="<ai-gateway-endpoint>",
    max_tokens=256,
    messages=[
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hello! How can I assist you today?"},
        {"role": "user", "content": "What is Databricks?"},
    ],
)

print(message.content[0].text)

Replace <ai-gateway-url> with your Unity AI Gateway URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.

Google Gemini API

Python
from google import genai
from google.genai import types
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = genai.Client(
    api_key="databricks",
    http_options=types.HttpOptions(
        base_url="https://<ai-gateway-url>/gemini",
        headers={
            "Authorization": f"Bearer {DATABRICKS_TOKEN}",
        },
    ),
)

response = client.models.generate_content(
    model="<ai-gateway-endpoint>",
    contents=[
        types.Content(
            role="user",
            parts=[types.Part(text="Hello!")],
        ),
        types.Content(
            role="model",
            parts=[types.Part(text="Hello! How can I assist you today?")],
        ),
        types.Content(
            role="user",
            parts=[types.Part(text="What is Databricks?")],
        ),
    ],
    config=types.GenerateContentConfig(
        max_output_tokens=256,
    ),
)

print(response.text)

Replace <ai-gateway-url> with your Unity AI Gateway URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.
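As the example above shows, Gemini uses the role "model" where OpenAI-style chat uses "assistant". If you maintain conversation history in the chat-style format, a hypothetical converter (not part of any SDK) can produce the (role, text) pairs you would wrap in types.Content and types.Part:

```python
# Hypothetical role mapping: OpenAI-style chat roles -> Gemini roles.
# Gemini's assistant-turn role is "model"; user turns stay "user".
ROLE_MAP = {"user": "user", "assistant": "model"}

def to_gemini_turns(messages):
    """Convert {"role", "content"} messages to (gemini_role, text) pairs."""
    return [(ROLE_MAP[m["role"]], m["content"]) for m in messages]
```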

Next steps