`ai_query` function

Applies to: Databricks SQL, Databricks Runtime

Preview

This feature is in Public Preview and is HIPAA compliant.

During the preview:

The underlying language model can handle several languages, but this AI Function is tuned for English.
See Features with limited regional availability for AI Functions region availability.

Invokes an existing Databricks Model Serving endpoint and parses and returns its response.

ai_query is a general-purpose AI Function that lets you query existing endpoints for real-time or batch inference workloads.

See General purpose function: ai_query for supported models and required endpoint configurations.
You can also use ai_query to query an AI agent deployed on a Model Serving endpoint. See Query a deployed Mosaic AI agent.
To use ai_query in production workflows, see Deploy batch inference pipelines.

Requirements

This function is not available on Databricks SQL Classic.
You must enable AWS PrivateLink to use this feature on pro SQL warehouses.
Databricks Runtime 15.4 LTS or above is recommended. Using Databricks Runtime 15.3 or below might result in slower performance.
Databricks Runtime 15.4 LTS or above is required for batch inference scenarios.
Your workspace must be in a supported Model Serving region.
An existing model serving endpoint with your model loaded. If you are using a Databricks hosted foundation model, an endpoint is created for you. Otherwise, see Create custom model serving endpoints or Create foundation model serving endpoints.
Querying Foundation Model APIs is enabled by default. To query endpoints that serve custom models or external models:
- Enable AI_Query for Custom Models and External Models in the Databricks Previews UI.
The current Lakeflow Spark Declarative Pipelines warehouse channel does not use the latest Databricks Runtime version that supports ai_query(). Set the pipelines.channel in the table properties as 'preview' to use ai_query().

SQL
> create or replace materialized view
    ai_query_mv
    TBLPROPERTIES('pipelines.channel' = 'PREVIEW') AS
  SELECT
    ai_query("databricks-meta-llama-3-3-70b-instruct", text) as response
  FROM
    messages
  LIMIT 10;

Syntax

To query an endpoint that serves a foundation model:

ai_query(endpoint, request)

To query a custom model serving endpoint with a model schema:

ai_query(endpoint, request)

To query a custom model serving endpoint without a model schema:

ai_query(endpoint, request, returnType, failOnError)

Arguments and returns

Argument	Description	Returns
`endpoint`	The name of a Databricks Foundation Model serving endpoint, an external model serving endpoint, or a custom model endpoint in the same workspace, specified as a `STRING` literal. The definer must have `CAN QUERY` permission on the endpoint.
`request`	The request used to invoke the endpoint as an expression. If the endpoint is an external model serving endpoint or Databricks Foundation Model APIs endpoint, the request must be a `STRING`. If the endpoint is a custom model serving endpoint, the request can be a single column or a STRUCT expression. The STRUCT field names should match the input feature names expected by the endpoint.
`returnType`	(Optional for Databricks Runtime 15.2 and above) The expected return type from the endpoint as an expression. This is similar to the schema parameter of the `from_json` function, which accepts either a `STRING` expression or an invocation of the `schema_of_json` function. If it is not provided, `ai_query()` automatically infers the return type from the model schema of the custom model serving endpoint. For Databricks Runtime 15.1 and below, this expression is required for querying a custom model serving endpoint. Use the `responseFormat` parameter to specify response formats for chat foundation models.
`failOnError`	(Optional) A boolean literal that defaults to true. Requires Databricks Runtime 15.3 or above. This flag indicates whether to include error status in the `ai_query` response. See Handle errors using `failOnError` for an example.	The following describes the return behavior based on the `failOnError` scenario: If `failOnError => true`, the function returns the same result as the existing behavior, which is the parsed response from the endpoint. The data type of the parsed response is inferred from the model type, the model schema endpoint, or the `returnType` parameter in the `ai_query` function. If `failOnError => false`, the function returns a `STRUCT` object that contains the parsed response and the error status string. If the inference of the row succeeds, the `errorStatus` field is `null`. If the inference of the row fails due to model endpoint errors, the `response` field is `null`. If the inference of the row fails due to other errors, the whole query fails.
`modelParameters`	(Optional) A struct field that contains chat, completion, and embedding model parameters for serving foundation models or external models. These model parameters must be constant and not data dependent. Requires Databricks Runtime 15.3 or above. When these model parameters are not specified or are set to `null`, the default value is used. With the exception of `temperature`, which has a default value of `0.0`, the default values for these model parameters are the same as those listed in Foundation model REST API reference. See Configure a model by passing model parameters for an example.
`responseFormat`	(Optional) Specify the response format you want the chat foundation model to follow. Requires Databricks Runtime 15.4 LTS or above. Only available for querying chat foundation models. Two styles of response format are supported: a DDL-style JSON string and a JSON string. Three JSON string types of response format are supported: `text`, `json_object`, and `json_schema`. See Enforce output schema with structured output for examples.	The following describes what happens when `failOnError` is also set when `responseFormat` is specified: If `failOnError => false` and you have specified `responseFormat`, the function returns the parsed response and the error status string as a `STRUCT` object. Depending on the JSON string type specified in `responseFormat`, the following response is returned: For `responseFormat => '{"type": "text"}'`, the response is a string such as `"Here is the response"`. For `responseFormat => '{"type": "json_object"}'`, the response is a key-value pair JSON string, such as `{"key": "value"}`. For `responseFormat => '{"type": "json_schema", "json_schema"...}'`, the response is a JSON string.
`files`	(Optional) Specify which files and content to use in your multimodal input requests using `files => content`. `files` is the parameter name expected by the model for multimodal input, and `content` refers to the column in the DataFrame that contains the binary content of the image files you want to use in your query. Required for multimodal requests. Only `JPEG` and `PNG` image inputs are supported. See Multimodal inputs for an example. As that example shows, you can pass the `content` column of the output of `read_files` to the `files` parameter.

Example: Query a foundation model

To query an external model serving endpoint:

SQL
> SELECT ai_query(
    'my-external-model-openai-chat',
    'Describe Databricks SQL in 30 words.'
  ) AS summary

  "Databricks SQL is a cloud-based platform for data analytics and machine learning, providing a unified workspace for collaborative data exploration, analysis, and visualization using SQL queries."

To query a foundation model supported by Databricks Foundation Model APIs:

SQL
> SELECT *,
  ai_query(
    'databricks-meta-llama-3-3-70b-instruct',
    "Can you tell me the name of the US state that serves the provided ZIP code? zip code: " || pickup_zip
    )
  FROM samples.nyctaxi.trips
  LIMIT 10

Optionally, you can also wrap a call to ai_query() in a UDF for function calling as follows:

SQL
> CREATE FUNCTION correct_grammar(text STRING)
  RETURNS STRING
  RETURN ai_query(
    'databricks-meta-llama-3-3-70b-instruct',
    CONCAT('Correct this to standard English:\n', text));
> GRANT EXECUTE ON FUNCTION correct_grammar TO ds;
-- DS fixes grammar issues in a batch.
> SELECT
    * EXCEPT text,
    correct_grammar(text) AS text
  FROM articles;

Multimodal inputs

ai_query natively supports multimodal image inputs. See Foundation model types for the supported Databricks-hosted vision models.

The following are supported input types:

JPEG
PNG

The following example shows how to query a foundation model supported by Databricks Foundation Model APIs for multimodal input. In this example, the files => content parameter passes the image file data to ai_query:

files: The parameter name expected by the model for multimodal input
content: The column in the DataFrame returned by READ_FILES, which contains the binary content of the image file.

SQL

> SELECT *, ai_query(
    'databricks-llama-4-maverick',
    'what is this image about?',
    files => content
  ) AS output
  FROM READ_FILES("/Volumes/main/multimodal/unstructured/image.jpeg");

To query a foundation model supported by Databricks Foundation Model APIs for multimodal input and specify structured output:

SQL
> SELECT *, ai_query(
    'databricks-llama-4-maverick',
    'What is interesting or important about this image?',
    responseFormat => '{
      "type": "json_schema",
      "json_schema": {
        "name": "output",
        "schema": {
          "type": "object",
          "properties": {
            "summary": {"type": "string"},
            "num_people": {"type": "integer"},
            "num_animals": {"type": "integer"},
            "interesting_fact": {"type": "string"},
            "possible_context": {"type": "string"}
          }
        },
        "strict": true
      }
    }',
    files => content
  ) AS output
  FROM READ_FILES("/Volumes/main/user/volume1/image.jpeg");

Example: Query a traditional ML model

To query a custom model or a traditional ML model serving endpoint:

SQL

> SELECT text, ai_query(
    endpoint => 'spam-classification-endpoint',
    request => named_struct(
      'timestamp', timestamp,
      'sender', from_number,
      'text', text),
    returnType => 'BOOLEAN') AS is_spam
  FROM messages
  LIMIT 10

> SELECT ai_query(
    'weekly-forecast',
    request => struct(*),
    returnType => 'FLOAT') AS predicted_revenue
  FROM retail_revenue

Examples for advanced scenarios

The following sections provide examples for advanced use cases like error handling or how to incorporate ai_query into a user-defined function.

Pass a messages array

The following example shows how to pass a messages array to your model or agent application using ai_query.

SQL
> SELECT ai_query(
    'custom-llama-chat',
    request => named_struct("messages",
        ARRAY(named_struct("role", "user", "content", "What is ML?"))),
    returnType => 'STRUCT<candidates:ARRAY<STRING>>')

  {"candidates":["ML stands for Machine Learning. It's a subfield of Artificial Intelligence that involves the use of algorithms and statistical models to enable machines to learn from data, make decisions, and improve their performance on a specific task over time."]}
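Arguments and returns notes that returnType accepts either a STRING expression or an invocation of schema_of_json. The following is a minimal sketch of the second form, not a definitive pattern. It assumes the same hypothetical 'custom-llama-chat' endpoint as above, and the sample JSON literal is made up only so that schema_of_json can derive a struct with an array-of-strings candidates field.

```sql
-- Inspect the schema string that schema_of_json derives from a sample response.
-- The sample document is illustrative; use a real response from your endpoint.
SELECT schema_of_json('{"candidates": ["sample answer"]}') AS inferred_return_type;

-- Sketch: pass the derived schema as returnType instead of writing the DDL by hand.
SELECT ai_query(
    'custom-llama-chat',
    request => named_struct('messages',
        ARRAY(named_struct('role', 'user', 'content', 'What is ML?'))),
    returnType => schema_of_json('{"candidates": ["sample answer"]}')
  ) AS answer;
```

Deriving the type from a sample keeps the declared returnType in step with what the endpoint actually sends, at the cost of depending on the sample being representative.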

Concatenate the prompt and inference column

There are multiple ways to concatenate the prompt and the inference column, such as using ||, CONCAT(), or format_string():

SQL
SELECT
CONCAT('${prompt}', ${input_column_name}) AS concatenated_prompt
FROM ${input_table_name};

Alternatively:

SQL
SELECT
'${prompt}' || ${input_column_name} AS concatenated_prompt
FROM ${input_table_name};

Or using format_string():

SQL
SELECT
format_string('%s%s', '${prompt}', ${input_column_name}) AS concatenated_prompt
FROM ${input_table_name};
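The templates above use placeholders. As a concrete sketch, the following query applies all three forms to an inline two-row table; the prompt text, the alias t, and the column name review are illustrative only. One row is NULL to show why a guard can matter: CONCAT and || both return NULL when any input is NULL, so the last column wraps the input in coalesce.

```sql
-- Sketch: three equivalent ways to build a prompt, plus a NULL-safe variant.
SELECT
  CONCAT('Summarize: ', review)                       AS with_concat,
  'Summarize: ' || review                             AS with_pipes,
  format_string('%s%s', 'Summarize: ', review)        AS with_format_string,
  'Summarize: ' || coalesce(review, '(no text)')      AS null_safe_prompt
FROM VALUES
  ('Great product'),
  (CAST(NULL AS STRING)) AS t(review);
```

For the first row, each column evaluates to 'Summarize: Great product'. Whether to substitute a placeholder or filter NULL rows out before calling ai_query depends on the workload.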

Configure a model by passing model parameters

Customize model behavior by passing specific parameters such as maximum tokens and temperature. For example:

SQL
SELECT text, ai_query(
    "databricks-meta-llama-3-3-70b-instruct",
    "Please summarize the following article: " || text,
    modelParameters => named_struct('max_tokens', 100, 'temperature', 0.7)
) AS summary
FROM uc_catalog.schema.table;

Handle errors using `failOnError`

Use the failOnError argument to control how ai_query handles errors. The following example sets failOnError => false so that an endpoint error on one row does not stop the whole query. See Arguments and returns for the expected behavior for each setting.

SQL

SELECT text, ai_query(
    "databricks-meta-llama-3-3-70b-instruct",
    "Summarize the given text comprehensively, covering key points and main ideas concisely while retaining relevant details and examples. Ensure clarity and accuracy without unnecessary repetition or omissions: " || text,
    failOnError => false
) AS summary
FROM uc_catalog.schema.table;
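With failOnError => false the result is a STRUCT, so downstream queries usually unpack it into separate columns. The following is a minimal sketch that assumes the same placeholder table as above; the CTE name scored, the alias result, and the shortened prompt are illustrative. It expands the struct with .* rather than naming the fields, because the exact field names should be checked against the output in your runtime (Arguments and returns refers to them as response and errorStatus).

```sql
-- Sketch: keep per-row results and errors side by side instead of failing the query.
WITH scored AS (
  SELECT
    text,
    ai_query(
      'databricks-meta-llama-3-3-70b-instruct',
      'Summarize the following text: ' || text,
      failOnError => false
    ) AS result
  FROM uc_catalog.schema.table
)
SELECT
  text,
  result.*   -- one column per struct field: the parsed response and the error status
FROM scored;
```

Once the field names are confirmed, the same CTE can be filtered on the error field to route failed rows to a retry table.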

Enforce output schema with structured output

Ensure that the output conforms to a specific schema for easier downstream processing using responseFormat. See Structured outputs on Databricks.

The following example enforces a DDL style JSON string schema:

SQL
SELECT ai_query(
    "databricks-gpt-oss-20b",
    "Extract research paper details from the following abstract: " || abstract,
    responseFormat => 'STRUCT<research_paper_extraction:STRUCT<title:STRING, authors:ARRAY<STRING>, abstract:STRING, keywords:ARRAY<STRING>>>'
)
FROM research_papers;

Alternatively, using a JSON schema response format:

SQL
SELECT ai_query(
    "databricks-gpt-oss-20b",
    "Extract research paper details from the following abstract: " || abstract,
    responseFormat => '{
      "type": "json_schema",
      "json_schema": {
        "name": "research_paper_extraction",
        "schema": {
          "type": "object",
          "properties": {
            "title": {"type": "string"},
            "authors": {"type": "array", "items": {"type": "string"}},
            "abstract": {"type": "string"},
            "keywords": {"type": "array", "items": {"type": "string"}}
          }
        },
        "strict": true
      }
    }'
)
FROM research_papers;

An expected output might look like:

JSON
{ "title": "Understanding AI Functions in Databricks", "authors": ["Alice Smith", "Bob Jones"], "abstract": "This paper explains how AI functions can be integrated into data workflows.", "keywords": ["Databricks", "AI", "LLM"] }
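Because the structured response arrives as a JSON string (see Arguments and returns), it typically needs to be parsed before individual fields can be queried. The following is a minimal sketch using from_json on the sample output above as a literal, so it runs without an endpoint. The aliases parsed and t and the output column names are illustrative; in practice the literal would be replaced by the column holding the ai_query response.

```sql
-- Sketch: turn the JSON string into typed columns with from_json.
SELECT
  parsed.title               AS title,
  parsed.authors[0]          AS first_author,
  size(parsed.keywords)      AS keyword_count
FROM (
  SELECT from_json(
    '{ "title": "Understanding AI Functions in Databricks", "authors": ["Alice Smith", "Bob Jones"], "abstract": "This paper explains how AI functions can be integrated into data workflows.", "keywords": ["Databricks", "AI", "LLM"] }',
    'title STRING, authors ARRAY<STRING>, abstract STRING, keywords ARRAY<STRING>'
  ) AS parsed
) AS t;
```

For this sample, the query returns the title 'Understanding AI Functions in Databricks', the first author 'Alice Smith', and a keyword count of 3.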

Use `ai_query` in user-defined functions

You can wrap a call to ai_query in a UDF, making it easy to use functions across different workflows and share them.

SQL
CREATE FUNCTION correct_grammar(text STRING)
  RETURNS STRING
  RETURN ai_query(
    'databricks-meta-llama-3-3-70b-instruct',
    CONCAT('Correct this to standard English:\n', text));

GRANT EXECUTE ON FUNCTION correct_grammar TO ds;

SELECT
    * EXCEPT text,
    correct_grammar(text) AS text
  FROM articles;
