Query model services with the OpenAI Responses API

Beta

This feature is in Beta. Account admins can control access to this feature from the account console Previews page. See Manage Databricks previews.

Important

The Responses API is only compatible with OpenAI foundation models. For a unified API that works across all providers, use the Chat Completions API.

The OpenAI Responses API is an alternative to the Chat Completions API that provides additional features for OpenAI models, including custom tools and multi-step workflows.

Requirements

See Requirements.
Install the appropriate package on your cluster for the querying client you choose.

Query examples

The examples in this section show how to query a model service using the OpenAI Responses API.

Python (DatabricksOpenAI)
Python (OpenAI)
REST API

To use the OpenAI Responses API, specify the model service's fully qualified name as the model parameter.

Python
from databricks_openai import DatabricksOpenAI

client = DatabricksOpenAI()

response = client.responses.create(
    model="system.ai.gpt-5",
    input=[
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "What is a mixture of experts model?",
      }
    ],
    max_output_tokens=256
)

To query foundation models from outside your workspace, you must use the OpenAI client directly. You also need your Databricks workspace instance to connect the OpenAI client to Databricks. The following example assumes you have a Databricks API token and the openai package installed on your compute.

Python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get('DATABRICKS_TOKEN'),
    base_url="https://<workspace-url>/ai-gateway/mlflow/v1"
)

response = client.responses.create(
    model="system.ai.gpt-5",
    input=[
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "What is a mixture of experts model?",
      }
    ],
    max_output_tokens=256
)

Bash
curl \
-u token:$DATABRICKS_TOKEN \
-X POST \
-H "Content-Type: application/json" \
-d '{
  "model": "system.ai.gpt-5",
  "input": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "What is a mixture of experts model?"
    }
  ],
  "max_output_tokens": 256
}' \
https://<workspace-url>/ai-gateway/mlflow/v1/responses
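For callers that want the REST request without curl or the OpenAI client, the following is a minimal sketch that builds the same request with the Python standard library. The function name build_responses_request and the host and token values are hypothetical placeholders, not part of any Databricks package. The request is only constructed here, not sent.

```python
import base64
import json
import urllib.request


def build_responses_request(workspace_host, token, model, messages, max_output_tokens=256):
    """Build (but do not send) a POST request for the Responses endpoint."""
    # Same JSON body as the curl example.
    body = {
        "model": model,
        "input": messages,
        "max_output_tokens": max_output_tokens,
    }
    # curl's `-u token:$DATABRICKS_TOKEN` is HTTP basic auth with the
    # literal user name "token" and the API token as the password.
    credentials = base64.b64encode(f"token:{token}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"https://{workspace_host}/ai-gateway/mlflow/v1/responses",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {credentials}",
        },
        method="POST",
    )


# Hypothetical placeholder values; substitute your own workspace host and token.
request = build_responses_request(
    workspace_host="example.cloud.databricks.com",
    token="dapi-example-token",
    model="system.ai.gpt-5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is a mixture of experts model?"},
    ],
)
```

Passing the request to urllib.request.urlopen and decoding the body with json.load would send it and parse the reply.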

Custom tools

Custom tools allow the model to return arbitrary string output instead of JSON-formatted function arguments. This is useful for code generation, applying patches, or other use cases where structured JSON is not required.

Note

Custom tools are only supported with GPT-5 series models (databricks-gpt-5, databricks-gpt-5-1, databricks-gpt-5-2, databricks-gpt-5-4, databricks-gpt-5-5, databricks-gpt-5-5-pro) through the Responses API.

Python
from databricks_openai import DatabricksOpenAI

client = DatabricksOpenAI()

response = client.responses.create(
    model="system.ai.gpt-5",
    input=[{"role": "user", "content": "Write a Python function to calculate factorial"}],
    tools=[
        {
            "type": "custom",
            "name": "code_exec",
            "description": "Executes arbitrary Python code. Return only valid Python code."
        }
    ],
    max_output_tokens=1024
)
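Because a custom tool returns a plain string rather than JSON arguments, the caller has to pull that string out of the response. The sketch below assumes the output items follow the OpenAI Responses API shape, in which a custom tool call is an item of type custom_tool_call with name and input fields. The behavior on Databricks may differ. The helper custom_tool_inputs and the sample data are hypothetical and hand-written, not real model output.

```python
def custom_tool_inputs(output_items, tool_name):
    """Return the raw string input of every call to `tool_name`."""
    return [
        item["input"]
        for item in output_items
        if item.get("type") == "custom_tool_call" and item.get("name") == tool_name
    ]


# Hand-written sample in the assumed item shape; not real model output.
sample_output = [
    {"type": "reasoning", "summary": []},
    {
        "type": "custom_tool_call",
        "name": "code_exec",
        "call_id": "call_123",
        "input": "def factorial(n):\n    return 1 if n <= 1 else n * factorial(n - 1)\n",
    },
]

for source in custom_tool_inputs(sample_output, "code_exec"):
    # The input is free-form text (here, Python source), not JSON.
    print(source)
```

With the client, response.output holds typed objects rather than dictionaries. Converting each item with model_dump() should yield dictionaries of this shape.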

Built-in tools

Built-in tools allow the model to call platform-provided capabilities without requiring you to implement the tool backend yourself. These tools return structured outputs and are fully managed by the platform.

Python
from databricks_openai import DatabricksOpenAI

client = DatabricksOpenAI()

response = client.responses.create(
    model="system.ai.gpt-5",
    input=[{
        "role": "user",
        "content": "Add input validation to the factorial function in main.py."
    }],
    tools=[
        {
            "type": "apply_patch"
        }
    ],
    max_output_tokens=1024
)

print(response.output_text)

Supported models

Databricks-hosted foundation models

databricks-gpt-5-5-pro
databricks-gpt-5-5
databricks-gpt-5-4
databricks-gpt-5-4-mini
databricks-gpt-5-4-nano
databricks-gpt-5-3-codex
databricks-gpt-5-2
databricks-gpt-5-2-codex
databricks-gpt-5-1
databricks-gpt-5-1-codex-max
databricks-gpt-5-1-codex-mini
databricks-gpt-5
databricks-gpt-5-mini
databricks-gpt-5-nano

Supported input types

OpenAI GPT models on Databricks accept text and image inputs. See Query foundation models by type for image format and size requirements. For per-model input types, see Databricks-hosted foundation models available in Foundation Model APIs.
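As an illustration of a combined text and image input, the sketch below builds one user message with two content parts. It assumes the OpenAI Responses API part types input_text and input_image, with the image passed as a base64 data URL. Check the linked pages for the formats and size limits that Databricks accepts. The helper image_message is a hypothetical name.

```python
import base64


def image_message(prompt, image_bytes, mime_type="image/png"):
    """Build one user message holding a text part and an inline image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "input_text", "text": prompt},
            # Assumed format: the image travels inline as a base64 data URL.
            {"type": "input_image", "image_url": f"data:{mime_type};base64,{encoded}"},
        ],
    }


# Placeholder bytes for illustration; in practice, read them from an image file.
message = image_message("Describe this image.", b"\x89PNG placeholder")
```

The resulting dictionary can be passed as an element of the input list in client.responses.create.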

Limitations

The following limitations apply to Databricks-hosted foundation models:

The following parameters are not supported. A request that specifies any of them returns a 400 error:

background: Background processing is not supported.
store: Stored responses are not supported.
previous_response_id: Stored responses are not supported.
service_tier: Service tier selection is managed by Databricks.
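A client-side check can catch these parameters before the request is sent. The following is a minimal sketch. The helper find_unsupported_params is a hypothetical name, and it assumes that the mere presence of a parameter triggers the error, whatever its value.

```python
# Parameters listed above as unsupported for Databricks-hosted foundation models.
UNSUPPORTED_PARAMS = ("background", "store", "previous_response_id", "service_tier")


def find_unsupported_params(request_kwargs):
    """Return the unsupported parameter names present in `request_kwargs`."""
    return [name for name in UNSUPPORTED_PARAMS if name in request_kwargs]


kwargs = {
    "model": "system.ai.gpt-5",
    "input": "Hello",
    "store": True,
    "service_tier": "auto",
}

rejected = find_unsupported_params(kwargs)
if rejected:
    print(f"Remove before sending: {', '.join(rejected)}")
```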

The following tool types are supported for pay-per-token foundation models:

function: Traditional structured function calling
custom: Custom user-defined tools
apply_patch: Code patching operations
shell: Shell command execution
image_generation: Image generation
mcp: Model Context Protocol tools
web_search: Web search
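The same kind of pre-flight check works for the tools list. This sketch flags any tool whose type is not in the list above. The helper unsupported_tool_types is a hypothetical name, and file_search is used only as an example of a type that the list does not include.

```python
# Tool types listed above as supported for pay-per-token foundation models.
SUPPORTED_TOOL_TYPES = {
    "function",
    "custom",
    "apply_patch",
    "shell",
    "image_generation",
    "mcp",
    "web_search",
}


def unsupported_tool_types(tools):
    """Return the type of each tool that is not in the supported set, in order."""
    return [
        tool.get("type")
        for tool in tools
        if tool.get("type") not in SUPPORTED_TOOL_TYPES
    ]


tools = [
    {"type": "custom", "name": "code_exec", "description": "Return only valid Python code."},
    {"type": "apply_patch"},
    {"type": "file_search"},  # Not in the supported list above.
]

flagged = unsupported_tool_types(tools)
```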
