Use ai_query
This feature is in Public Preview.
ai_query is a general-purpose AI Function that lets you query any supported AI model directly from SQL or Python. Unlike task-specific AI Functions, which are purpose-built and optimized for a single task, ai_query gives you full control over the model, prompt, and parameters.
For full syntax and parameter reference, see ai_query function.
When to use ai_query
Databricks recommends starting with a task-specific AI Function when one matches your objective. Use ai_query when a task-specific function doesn't meet your needs. For example, when you need to:
- Control the prompt, model parameters, or output format more precisely
- Query a custom, fine-tuned, or external model
- Optimize further for throughput or quality
Best practices
- Use pre-deployed foundation models. Use Databricks-hosted foundation model endpoints (prefixed with `databricks-`) instead of provisioned throughput endpoints. These endpoints are fully managed and scale automatically without provisioning or configuration.
- Select a model optimized for batch inference. Databricks optimizes specific models for high-throughput batch workloads. Using a non-optimized model can result in reduced throughput and longer job completion times. See Supported models for the full list of batch-optimized models.
- Submit your full dataset in a single query. AI Functions automatically handle parallelization, retries, and scaling. Manually splitting data into small batches can reduce throughput.
- Set `failOnError` to `false` for large workloads. This allows the job to complete and return error messages for failed rows, so you retain successful results without reprocessing the entire dataset.
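The `failOnError` recommendation above can be sketched as follows. This is an illustrative example, not taken from this page; the table `uc_catalog.schema.reviews` and its `text` column are placeholders:

```sql
-- Sketch: batch inference that tolerates per-row failures.
-- failOnError => false lets the query complete even if some rows fail.
SELECT
  text,
  ai_query(
    "databricks-meta-llama-3-3-70b-instruct",
    "Classify the sentiment of this review as positive, negative, or neutral: " || text,
    failOnError => false
  ) AS response
FROM uc_catalog.schema.reviews;
```

With `failOnError => false`, each output row is a struct that includes an `errorMessage` field for failed rows, so you can filter out and reprocess only those rows instead of rerunning the entire dataset.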
Supported models
ai_query supports pre-deployed Databricks models, models you bring yourself, and external models.
The following table summarizes the supported model types, the associated models, and model serving endpoint configuration requirements for each.
| Type | Description | Supported models | Requirements |
|---|---|---|---|
| Pre-deployed models | Databricks hosts these foundation models and offers preconfigured endpoints that you can query using `ai_query`. | These models are supported and optimized for getting started with batch inference and production workflows. Other Databricks-hosted models are available for use with AI Functions, but are not recommended for batch inference production workflows at scale; they are made available for real-time inference using Foundation Model APIs pay-per-token. | Requires Databricks Runtime 15.4 LTS or above. Requires no endpoint provisioning or configuration. Your use of these models is subject to the applicable model developer terms and AI Functions region availability. |
| Bring your own model | You can bring your own models and query them using AI Functions, which offer the flexibility to query models for real-time inference or batch inference scenarios. | | |
Use ai_query with foundation models
The following example demonstrates how to use ai_query with a foundation model hosted by Databricks.
- See `ai_query` function for syntax details and parameters.
- See Multimodal inputs for multimodal input query examples.
- See Examples for advanced scenarios for guidance on configuring parameters for advanced use cases, such as:
  - Handling errors using `failOnError`.
  - Specifying structured output for your query responses. See Structured outputs on Databricks.
```sql
SELECT text, ai_query(
    "databricks-meta-llama-3-3-70b-instruct",
    "Summarize the given text comprehensively, covering key points and main ideas concisely while retaining relevant details and examples. Ensure clarity and accuracy without unnecessary repetition or omissions: " || text
  ) AS summary
FROM uc_catalog.schema.table;
```
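For advanced use cases such as structured outputs, `ai_query` also accepts a `responseFormat` argument. The following is a minimal sketch; the table name and JSON schema are illustrative placeholders, not taken from this page:

```sql
-- Sketch: constrain the model's response to a JSON schema.
SELECT ai_query(
    "databricks-meta-llama-3-3-70b-instruct",
    "Extract the product name and overall sentiment from this review: " || text,
    responseFormat => '{
      "type": "json_schema",
      "json_schema": {
        "name": "review_extraction",
        "schema": {
          "type": "object",
          "properties": {
            "product_name": {"type": "string"},
            "sentiment": {"type": "string"}
          }
        },
        "strict": true
      }
    }'
  ) AS extracted
FROM uc_catalog.schema.reviews;
```

Constraining the output shape this way makes downstream parsing more reliable than prompting the model to "respond in JSON".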
Example notebook: Batch inference and structured data extraction
The following example notebook demonstrates basic structured data extraction, using ai_query to transform raw, unstructured data into organized, usable information. It also shows how to use Agent Evaluation to measure accuracy against ground-truth data.
Batch inference and structured data extraction notebook
Use ai_query with traditional ML models
ai_query supports traditional ML models, including fully custom ones. These models must be deployed on Model Serving endpoints. For syntax details and parameters, see ai_query function.
```sql
SELECT text, ai_query(
    endpoint => "spam-classification",
    request => named_struct(
      "timestamp", timestamp,
      "sender", from_number,
      "text", text),
    returnType => "BOOLEAN") AS is_spam
FROM catalog.schema.inbox_messages
LIMIT 10;
```
Example notebook: Batch inference using BERT for named entity recognition
The following notebook shows a traditional ML model batch inference example using BERT.
Batch inference using BERT for named entity recognition notebook
Use ai_query in Python workflows
ai_query can be integrated into existing Python workflows.
The following writes the output of ai_query to an output table:
```python
df_out = df.selectExpr(
    "ai_query('databricks-meta-llama-3-3-70b-instruct', CONCAT('Please provide a summary of the following text: ', text), modelParameters => named_struct('max_tokens', 100, 'temperature', 0.7)) as summary"
)
df_out.write.mode("overwrite").saveAsTable("output_table")
```