ai_query function

Applies to: Databricks SQL


This feature is in Public Preview.

Invokes an existing Databricks Model Serving endpoint, then parses and returns its response.



To query an endpoint that serves an external model or a foundation model:

ai_query(endpointName, request)

To query a custom model serving endpoint:

ai_query(endpointName, request, returnType)


  • endpointName: A STRING literal naming an existing Mosaic AI Model Serving endpoint in the same workspace. The definer must have CAN QUERY permission on the endpoint.

  • request: An expression, the request used to invoke the endpoint.

    • If the endpoint is an external model serving endpoint or Databricks Foundation Model APIs endpoint, the request must be a STRING.

    • If the endpoint is a custom model serving endpoint, the request can be a single column or a struct expression. The struct field names should match the input feature names expected by the endpoint.

  • returnType: An expression, the expected return type from the endpoint. This is similar to the schema parameter of the from_json function, which accepts either a STRING expression or an invocation of the schema_of_json function. Required when querying a custom model serving endpoint.
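
As a rough sketch of the two returnType forms described above, the query below derives the return type with schema_of_json from a sample JSON response instead of spelling out the type string by hand. The endpoint name 'my-custom-endpoint', the tickets table, its columns, and the sample response shape are all hypothetical; adjust them to match what your endpoint actually expects and returns.

```sql
-- Hypothetical endpoint, table, and response shape, for illustration only.
SELECT
  ticket_id,
  ai_query(
    'my-custom-endpoint',
    -- Struct field names should match the endpoint's input feature names.
    request => named_struct('subject', subject, 'body', body),
    -- schema_of_json infers the return type from a sample JSON response.
    returnType => schema_of_json('{"label": "billing", "score": 0.93}')
  ) AS prediction
FROM tickets;
```

Under these assumptions, passing the equivalent type as a STRING, such as 'STRUCT<label: STRING, score: DOUBLE>', should behave the same way; the schema_of_json form is mainly convenient when a sample response is already at hand.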


Returns the parsed response from the endpoint.


To query an External Model Serving Endpoint or Databricks Foundation Model:

> SELECT ai_query(
    '<endpoint-name>',  -- placeholder endpoint name
    'Describe Databricks SQL in 30 words.'
  ) AS summary

  "Databricks SQL is a cloud-based platform for data analytics and machine learning, providing a unified workspace for collaborative data exploration, analysis, and visualization using SQL queries."

> CREATE FUNCTION correct_grammar(text STRING)
  RETURN ai_query(
    '<endpoint-name>',  -- placeholder endpoint name
    CONCAT('Correct this to standard English:\n', text));
> GRANT EXECUTE ON FUNCTION correct_grammar TO ds;
-- DS fixes grammar issues in a batch.
> SELECT
    * EXCEPT text,
    correct_grammar(text) AS text
  FROM articles;

To query a custom Model Serving Endpoint:

> SELECT text, ai_query(
    endpoint => 'spam-classification-endpoint',
    request => named_struct(
      'timestamp', timestamp,
      'sender', from_number,
      'text', text),
    returnType => 'BOOLEAN') AS is_spam
  FROM messages

> SELECT ai_query(
    '<endpoint-name>',  -- placeholder endpoint name
    request => struct(*),
    returnType => 'FLOAT') AS predicted_revenue
  FROM retail_revenue

> SELECT ai_query(
    '<endpoint-name>',  -- placeholder endpoint name
    request => named_struct("messages",
        ARRAY(named_struct("role", "user", "content", "What is ML?"))),
    returnType => 'STRUCT<candidates:ARRAY<STRING>>')

  {"candidates":["ML stands for Machine Learning. It's a subfield of Artificial Intelligence that involves the use of algorithms and statistical models to enable machines to learn from data, make decisions, and improve their performance on a specific task over time."]}