Function calling on Databricks
Preview
This feature is in Public Preview and is supported on both Foundation Model APIs pay-per-token and provisioned throughput endpoints.
This article describes function calling and how to use it as part of your generative AI application workflows. Databricks Function Calling is OpenAI-compatible and is only available during model serving as part of Foundation Model APIs.
What is function calling
Function calling provides a way for you to control the output of LLMs, so they generate structured responses more reliably. When you use function calling, you describe functions in the API call and specify their arguments using a JSON schema. The LLM itself does not call these functions; instead, it creates a JSON object that you can use to call the functions in your code.
For function calling on Databricks, the basic sequence of steps is as follows:
1. Call the model using the submitted query and a set of functions defined in the tools parameter.
2. The model decides whether or not to call the defined functions. When the function is called, the content is a JSON object of strings that adheres to your custom schema.
3. Parse the strings into JSON in your code, and call your function with the provided arguments if they exist.
4. Call the model again by appending the structured response as a new message. The structure of the response is defined by the functions you previously provided in tools. From here, the model summarizes the results and sends that summary to the user.
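The parsing and follow-up steps above can be sketched as follows. This is a minimal illustration, assuming the tool calls have already been converted to plain dictionaries in the OpenAI-compatible shape. The names run_tool_calls and TOOL_REGISTRY, the stub get_current_weather, and the sample values are hypothetical and exist only for this sketch.

```python
import json

def get_current_weather(location, unit="fahrenheit"):
    """Hypothetical local function; a real one would query a weather service."""
    return {"location": location, "temperature": 72, "unit": unit}

# Hypothetical registry mapping tool names to local Python callables.
TOOL_REGISTRY = {"get_current_weather": get_current_weather}

def run_tool_calls(tool_calls, registry):
    """Parse each tool call's JSON arguments, run the matching function,
    and return one "tool" message per call to append to the conversation."""
    tool_messages = []
    for call in tool_calls or []:
        name = call["function"]["name"]
        # The arguments arrive as a JSON-encoded string, not as a dict.
        arguments = json.loads(call["function"]["arguments"] or "{}")
        result = registry[name](**arguments)
        tool_messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return tool_messages

# Example tool call in the OpenAI-compatible shape (values are made up).
example_calls = [{
    "id": "call_123",
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "arguments": '{"location": "Chicago, IL", "unit": "celsius"}',
    },
}]
tool_messages = run_tool_calls(example_calls, TOOL_REGISTRY)
```

To finish the sequence, append the assistant message that contained the tool calls, followed by these tool messages, to the conversation and call the model again with the same tools.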
When to use function calling
The following are example use cases for function calling:
- Create assistants that can answer questions by calling other APIs. For example, you can define functions like send_email(to: string, body: string) or current_weather(location: string, unit: 'celsius' | 'fahrenheit').
- Define and use API calls based on natural language. For example, take the statement "Who are my top customers?", turn it into an API call named get_customers(min_revenue: int, created_before: string, limit: int), and call that API.
- For batch inference or data processing tasks, such as converting unstructured data into structured data, Databricks recommends using structured outputs.
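As an illustration of the natural language use case, the get_customers signature above could be described to the model with a tool definition like the following. The description strings and the choice of required fields are assumptions made for this sketch.

```python
import json

# Sketch of a tool definition for the get_customers example.
# The description strings and the "required" list are assumptions.
get_customers_tool = {
    "type": "function",
    "function": {
        "name": "get_customers",
        "description": "List customers filtered by revenue and creation date",
        "parameters": {
            "type": "object",
            "properties": {
                "min_revenue": {
                    "type": "integer",
                    "description": "Minimum revenue a customer must have",
                },
                "created_before": {
                    "type": "string",
                    "description": "Only include customers created before this date",
                },
                "limit": {
                    "type": "integer",
                    "description": "Maximum number of customers to return",
                },
            },
            "required": ["limit"],
        },
    },
}

tools = [get_customers_tool]
print(json.dumps(tools, indent=2))
```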
Supported models
The following table lists the supported models and which model serving feature makes each model available.
For models made available by Foundation Model APIs, see Foundation Model APIs limits for region availability.
For models made available by External models, see Region availability for region availability.
Important
Meta Llama 3.1 is licensed under the LLAMA 3.1 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved. Customers are responsible for ensuring compliance with applicable model licenses.
| Model | Made available using model serving feature | Notes |
|---|---|---|
|  | Foundation Model APIs | Supported on pay-per-token and provisioned throughput workloads. |
|  | Foundation Model APIs | Supported on pay-per-token and provisioned throughput workloads. |
|  | Foundation Model APIs | Supported on provisioned throughput workloads only. |
| gpt-4o | External models |  |
| gpt-4o-2024-08-06 | External models |  |
| gpt-4o-2024-05-13 | External models |  |
| gpt-4o-mini | External models |  |
Use function calling
To use function calling with your generative AI application, you must provide function parameters and a description.
The default behavior for tool_choice is "auto". This lets the model decide which functions to call and whether to call them.
You can customize the default behavior depending on your use case. The following are your options:
- Set tool_choice: "required". In this scenario, the model always calls one or more functions. The model selects which function or functions to call.
- Set tool_choice: {"type": "function", "function": {"name": "my_function"}}. In this scenario, the model calls only a specific function.
- Set tool_choice: "none" to disable function calling and have the model only generate a user-facing message.
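The default and the three options above can be written as plain Python values and passed as the tool_choice argument of client.chat.completions.create. The constant names and the helper force_function below are hypothetical, introduced only for this sketch.

```python
def force_function(name):
    """Build the tool_choice value that restricts the model to one function."""
    return {"type": "function", "function": {"name": name}}

TOOL_CHOICE_AUTO = "auto"          # default: the model decides whether to call a function
TOOL_CHOICE_REQUIRED = "required"  # the model always calls one or more functions
TOOL_CHOICE_NONE = "none"          # the model only generates a user-facing message

# Restrict the model to a single, named function.
tool_choice = force_function("get_current_weather")
```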
The following is a single turn example using the OpenAI SDK and its tools parameter. See Chat task for additional syntax details.
Important
During Public Preview, function calling on Databricks is optimized for single turn function calling.
import os
import json
from openai import OpenAI

# Read the Databricks token and base URL from environment variables.
DATABRICKS_TOKEN = os.environ.get('YOUR_DATABRICKS_TOKEN')
DATABRICKS_BASE_URL = os.environ.get('YOUR_DATABRICKS_BASE_URL')

client = OpenAI(
    api_key=DATABRICKS_TOKEN,
    base_url=DATABRICKS_BASE_URL
)

# Describe the function and its arguments using a JSON schema.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "unit": {
                        "type": "string",
                        "enum": [
                            "celsius",
                            "fahrenheit"
                        ]
                    }
                }
            }
        }
    }
]

messages = [{"role": "user", "content": "What is the current temperature of Chicago?"}]

# tool_choice="auto" lets the model decide whether to call the function.
response = client.chat.completions.create(
    model="databricks-meta-llama-3-1-70b-instruct",
    messages=messages,
    tools=tools,
    tool_choice="auto",
)

# Print the tool calls the model produced, if any.
print(json.dumps(response.choices[0].message.model_dump()['tool_calls'], indent=2))
JSON schema
Foundation Model APIs broadly support function definitions accepted by OpenAI. However, using a simpler JSON schema for function call definitions results in higher quality function call JSON generation. To promote higher quality generation, Foundation Model APIs only support a subset of JSON schema specifications.
The following function call definition keys are not supported:
- Regular expressions using pattern.
- Complex nested or schema composition and validation using: anyOf, oneOf, allOf, prefixItems, or $ref.
- Lists of types, except for the special case of [type, "null"], where one type in the list is a valid JSON type and the other is "null".
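A function definition can be checked against the unsupported keys above before it is sent. The following is a minimal sketch, not a Databricks utility: the helper find_unsupported and its message format are hypothetical, and it only walks properties and items.

```python
UNSUPPORTED_KEYS = {"pattern", "anyOf", "oneOf", "allOf", "prefixItems", "$ref"}

def find_unsupported(schema, path="parameters"):
    """Return a list of human-readable problems found in a JSON schema dict."""
    problems = []
    for key in sorted(UNSUPPORTED_KEYS):
        if key in schema:
            problems.append(f"{path}: unsupported key '{key}'")
    type_value = schema.get("type")
    if isinstance(type_value, list):
        # Only the two-element form [type, "null"] is allowed.
        if len(type_value) != 2 or type_value.count("null") != 1:
            problems.append(f"{path}: unsupported type list {type_value}")
    # Recurse into property schemas; property names themselves are not keywords.
    for name, sub in schema.get("properties", {}).items():
        problems.extend(find_unsupported(sub, f"{path}.properties.{name}"))
    if isinstance(schema.get("items"), dict):
        problems.extend(find_unsupported(schema["items"], f"{path}.items"))
    return problems

# Made-up schema with one regular expression and one disallowed type list.
schema = {
    "type": "object",
    "properties": {
        "zip": {"type": "string", "pattern": "^[0-9]{5}$"},
        "unit": {"type": ["string", "null"]},
        "id": {"type": ["string", "integer"]},
    },
}
problems = find_unsupported(schema)
```

Here the zip and id properties are reported, while unit passes because it uses the allowed [type, "null"] form.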
Additionally, the following limitations apply:
- The maximum number of keys specified in the JSON schema is 16.
- Foundation Model APIs does not enforce length or size constraints for objects and arrays. This includes keywords like maxProperties, minProperties, and maxLength.
- Heavily nested JSON schemas will result in lower quality generation. If possible, try flattening the JSON schema for better results.
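One possible way to flatten a nested schema is to lift nested object properties to the top level and join their names with an underscore. The helper flatten_properties below is a hypothetical sketch, not a Databricks utility; it ignores required lists and assumes the joined names do not collide.

```python
def flatten_properties(properties, prefix=""):
    """Lift nested object properties to a single level, joining names with '_'."""
    flat = {}
    for name, sub in properties.items():
        full_name = f"{prefix}{name}"
        if sub.get("type") == "object" and "properties" in sub:
            # Recurse into the nested object and prefix its property names.
            flat.update(flatten_properties(sub["properties"], f"{full_name}_"))
        else:
            flat[full_name] = sub
    return flat

# Made-up nested schema: location is an object with two string fields.
nested = {
    "location": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "state": {"type": "string"},
        },
    },
    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
}
flat_parameters = {"type": "object", "properties": flatten_properties(nested)}
```

With this approach, your own code has to map the flattened argument names back to whatever structure the underlying function expects.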
Token usage
Prompt injection and other techniques are used to enhance the quality of tool calls. Doing so impacts the number of input and output tokens consumed by the model, which in turn results in billing implications. The more tools you use, the more your input tokens increase.
Limitations
The following are limitations for function calling during Public Preview:
- The current function calling solution is optimized for single turn function calls. Multi-turn function calling is supported during the preview, but is under development.
- Parallel function calling is not supported.
- The maximum number of functions that can be defined in tools is 32 functions.
- For provisioned throughput support, function calling is only supported on new endpoints. You cannot add function calling to previously created endpoints.
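The limit on the number of functions can be checked on the client before a request is sent. This is a small sketch; the helper check_tool_count and the constant MAX_TOOLS are hypothetical names, and the value 32 is taken from the limitation listed above.

```python
MAX_TOOLS = 32  # limit stated above for the Public Preview

def check_tool_count(tools):
    """Raise ValueError if more functions are defined than the stated limit."""
    if len(tools) > MAX_TOOLS:
        raise ValueError(f"{len(tools)} tools defined; the limit is {MAX_TOOLS}")
    return tools
```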