
Use ai_query

Preview

This feature is in Public Preview.

ai_query is a general-purpose AI Function that lets you query any supported AI model directly from SQL or Python. Unlike task-specific AI Functions, which are purpose-built and optimized for a single task, ai_query gives you full control over the model, prompt, and parameters.

For full syntax and parameter reference, see ai_query function.

When to use ai_query

Databricks recommends starting with a task-specific AI Function when one matches your objective. Use ai_query when a task-specific function doesn't meet your needs. For example, when you need to:

  • Control the prompt, model parameters, or output format more precisely
  • Query a custom, fine-tuned, or external model
  • Further optimize for throughput or quality

Decision tree for task-specific AI functions and ai_query

Best practices

  • Use pre-deployed foundation models. Use Databricks-hosted foundation model endpoints (prefixed with databricks-) instead of provisioned throughput endpoints. These endpoints are fully managed and scale automatically without provisioning or configuration.
  • Select a model optimized for batch inference. Databricks optimizes specific models for high-throughput batch workloads. Using a non-optimized model can result in reduced throughput and longer job completion times. See Supported models for the full list of batch-optimized models.
  • Submit your full dataset in a single query. AI Functions automatically handle parallelization, retries, and scaling. Manually splitting data into small batches can reduce throughput.
  • Set failOnError to false for large workloads. This allows the job to complete and return error messages for failed rows, so you retain successful results without reprocessing the entire dataset.
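To make the last recommendation concrete, here is a minimal Python sketch of handling per-row errors when failOnError is false. It assumes each output row carries a struct with the model result and an error field; the field names ("result", "errorMessage") and the sample rows are illustrative assumptions, not a guaranteed schema.

```python
# Sketch: separating successful rows from failed ones when failOnError is
# false. Field names ("result", "errorMessage") are illustrative assumptions
# about the per-row output struct.
rows = [
    {"id": 1, "summary": {"result": "A short summary.", "errorMessage": None}},
    {"id": 2, "summary": {"result": None, "errorMessage": "timeout"}},
    {"id": 3, "summary": {"result": "Another summary.", "errorMessage": None}},
]

# Keep successful rows; collect the IDs of failed rows for a targeted retry
# instead of reprocessing the entire dataset.
successes = [r for r in rows if r["summary"]["errorMessage"] is None]
to_retry = [r["id"] for r in rows if r["summary"]["errorMessage"] is not None]
```

With this split, a follow-up job can rerun ai_query only on the rows listed in to_retry.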

Supported models

ai_query supports pre-deployed Databricks models, models you bring yourself, and external models.

The following table summarizes the supported model types, the associated models, and model serving endpoint configuration requirements for each.

Pre-deployed models

Description: Databricks hosts these foundation models and offers preconfigured endpoints that you can query using ai_query. See Supported foundation models on Mosaic AI Model Serving for which models are supported for each Model Serving feature and their region availability.

Supported models: The following models are supported and optimized for getting started with batch inference and production workflows:

  • databricks-gpt-5-2
  • databricks-gpt-5-1
  • databricks-gpt-5
  • databricks-gpt-5-mini
  • databricks-gpt-5-nano
  • databricks-gemini-3-1-pro
  • databricks-gemini-3-1-flash-lite
  • databricks-gemini-3-pro
  • databricks-gemini-3-flash
  • databricks-gemini-2-5-pro
  • databricks-gemini-2-5-flash
  • databricks-claude-sonnet-4
  • databricks-gpt-oss-20b
  • databricks-gpt-oss-120b
  • databricks-gemma-3-12b
  • databricks-llama-4-maverick
  • databricks-meta-llama-3-3-70b-instruct
  • databricks-meta-llama-3-1-8b-instruct
  • databricks-qwen3-embedding-0-6b
  • databricks-gte-large-en

Other Databricks-hosted models are available for use with AI Functions, but are not recommended for batch inference production workflows at scale. These other models are made available for real-time inference using Foundation Model APIs pay-per-token.

Requirements: Databricks Runtime 15.4 LTS or above. No endpoint provisioning or configuration is required. Your use of these models is subject to the Applicable model developer terms and AI Functions region availability.

Bring your own model

Description: You can bring your own models and query them using AI Functions. AI Functions offers flexibility so you can query models for real-time inference or batch inference scenarios.

Use ai_query with foundation models

The following example demonstrates how to use ai_query with a foundation model hosted by Databricks.

SQL

SELECT
  text,
  ai_query(
    "databricks-meta-llama-3-3-70b-instruct",
    "Summarize the given text comprehensively, covering key points and main ideas concisely while retaining relevant details and examples. Ensure clarity and accuracy without unnecessary repetition or omissions: " || text
  ) AS summary
FROM uc_catalog.schema.table;

Example notebook: Batch inference and structured data extraction

The following example notebook demonstrates how to perform basic structured data extraction, using ai_query to transform raw, unstructured data into organized, usable information through automated extraction techniques. It also shows how to use Agent Evaluation to evaluate extraction accuracy against ground truth data.

Batch inference and structured data extraction notebook

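As a rough illustration of the post-processing step in structured extraction (not the notebook's actual code), the model is typically prompted to return JSON, which you then parse and validate before using. The schema below ("product", "price") is a hypothetical example:

```python
import json

# Hypothetical sketch of validating a structured-extraction response.
# The required fields are illustrative, not from the notebook.
REQUIRED_KEYS = {"product", "price"}

def parse_extraction(raw: str):
    """Parse the model's JSON output; return None when it is malformed or
    missing required fields, so bad rows can be flagged for review."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not REQUIRED_KEYS.issubset(record):
        return None
    return record

good = parse_extraction('{"product": "widget", "price": 19.99}')
bad = parse_extraction('not json at all')
```

Validating every row this way makes it straightforward to route malformed extractions into a review or retry path.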

Use ai_query with traditional ML models

ai_query supports traditional ML models, including fully custom ones. These models must be deployed on Model Serving endpoints. For syntax details and parameters, see ai_query function.

SQL
SELECT
  text,
  ai_query(
    endpoint => "spam-classification",
    request => named_struct(
      "timestamp", timestamp,
      "sender", from_number,
      "text", text
    ),
    returnType => "BOOLEAN"
  ) AS is_spam
FROM catalog.schema.inbox_messages
LIMIT 10;

Example notebook: Batch inference using BERT for named entity recognition

The following notebook shows a traditional ML model batch inference example using BERT.

Batch inference using BERT for named entity recognition notebook


Use ai_query in Python workflows

ai_query can be integrated into existing Python workflows.

The following example writes the output of ai_query to a table:

Python

# Summarize each row's text column, then persist the results to a table
df_out = df.selectExpr(
    "ai_query("
    "'databricks-meta-llama-3-3-70b-instruct', "
    "CONCAT('Please provide a summary of the following text: ', text), "
    "modelParameters => named_struct('max_tokens', 100, 'temperature', 0.7)"
    ") AS summary"
)
df_out.write.mode("overwrite").saveAsTable("output_table")
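Because selectExpr takes the ai_query call as a SQL string, it can help to assemble that string in one place when several jobs share the same pattern. The helper below is a hypothetical convenience, not part of any Databricks API; it reproduces the expression used above:

```python
# Hypothetical helper (not a Databricks API) that assembles the ai_query(...)
# SQL expression string passed to selectExpr. Defaults mirror the example
# above; all names here are assumptions for illustration.
def ai_query_expr(endpoint: str, prompt_prefix: str, column: str,
                  max_tokens: int = 100, temperature: float = 0.7) -> str:
    return (
        f"ai_query('{endpoint}', "
        f"CONCAT('{prompt_prefix}', {column}), "
        f"modelParameters => named_struct("
        f"'max_tokens', {max_tokens}, 'temperature', {temperature})) as summary"
    )

expr = ai_query_expr(
    "databricks-meta-llama-3-3-70b-instruct",
    "Please provide a summary of the following text: ",
    "text",
)
```

The resulting string can then be passed directly to df.selectExpr(expr).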