Integrate Unity Catalog tools with third-party generative AI frameworks
Unity Catalog AI agent tools can be used with popular gen AI libraries like LangChain, LlamaIndex, OpenAI, and Anthropic. These integrations combine Unity Catalog tool governance with the capabilities of third-party agent authoring frameworks. For example:
- In LangChain, Unity Catalog functions can be part of an agent's workflow to perform tasks like querying or transforming data.
- In OpenAI or Anthropic integrations, the functions are called directly by the AI model during execution.
Select your framework in the following tabs to create a Unity Catalog tool and use it with that framework. Run the code in a Databricks notebook or Python script.
Requirements
- Install Python 3.10 or above.
- LangChain
- LlamaIndex
- OpenAI
- Anthropic
Use Databricks Unity Catalog to integrate SQL and Python functions as tools in LangChain and LangGraph workflows. This integration combines the governance of Unity Catalog with LangChain capabilities to build powerful LLM-based applications.
In this example, you create a Unity Catalog tool, test its functionality, and add it to an agent.
Install dependencies
Install the Unity Catalog AI integration package with the Databricks extra, and install the Databricks LangChain integration package.
# Install the Unity Catalog AI integration package with the Databricks extra
%pip install unitycatalog-langchain[databricks]
# Install the Databricks LangChain integration package
%pip install databricks-langchain
dbutils.library.restartPython()
Initialize the Databricks Function Client
Create a Databricks Function Client, which you use in the following steps to create and run Unity Catalog functions.
from unitycatalog.ai.core.base import get_uc_function_client
client = get_uc_function_client()
Define the tool's logic
Create a Unity Catalog function containing the tool's logic.
CATALOG = "my_catalog"
SCHEMA = "my_schema"
def add_numbers(number_1: float, number_2: float) -> float:
    """
    A function that accepts two floating point numbers, adds them,
    and returns the resulting sum as a float.

    Args:
        number_1 (float): The first of the two numbers to add.
        number_2 (float): The second of the two numbers to add.

    Returns:
        float: The sum of the two input numbers.
    """
    return number_1 + number_2

function_info = client.create_python_function(
    func=add_numbers,
    catalog=CATALOG,
    schema=SCHEMA,
    replace=True
)
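The example function declares a type hint for every parameter and for the return value, and it carries a Google-style docstring. Assuming the client derives the Unity Catalog function's signature and description from these (an assumption, not something stated above), a small pre-flight check can catch omissions before registration. The helper name `check_tool_ready` is hypothetical and is not part of any Unity Catalog package.

```python
import inspect


def check_tool_ready(func) -> list[str]:
    """Return a list of problems that would make `func` a poor tool candidate."""
    problems = []
    signature = inspect.signature(func)
    for name, param in signature.parameters.items():
        if param.annotation is inspect.Parameter.empty:
            problems.append(f"parameter '{name}' has no type hint")
    if signature.return_annotation is inspect.Signature.empty:
        problems.append("return value has no type hint")
    if not inspect.getdoc(func):
        problems.append("function has no docstring")
    return problems


def add_numbers(number_1: float, number_2: float) -> float:
    """Adds two numbers and returns the sum."""
    return number_1 + number_2


def untyped(a, b):
    # Deliberately missing type hints and a docstring
    return a + b


print(check_tool_ready(add_numbers))
print(check_tool_ready(untyped))
```

The check runs locally and never contacts Unity Catalog, so it can sit in a notebook cell ahead of the `create_python_function` call.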
Test the function
Test your function to check it works as expected:
result = client.execute_function(
    function_name=f"{CATALOG}.{SCHEMA}.add_numbers",
    parameters={"number_1": 36939.0, "number_2": 8922.4}
)

result.value # OUTPUT: '45861.4'
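As the output comment above shows, `result.value` comes back as a string. The following sketch compares it with a numeric expectation, using a stand-in string instead of a live call. The variable name `returned_value` is illustrative.

```python
import math

# Stand-in for result.value from the call above; no live call is made here
returned_value = "45861.4"

expected = 36939.0 + 8922.4

# Convert the string before comparing, and allow for floating point rounding
matches = math.isclose(float(returned_value), expected, rel_tol=1e-9)
print(matches)
```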
Wrap the function using the UCFunctionToolkit
Wrap the function using the UCFunctionToolkit to make it accessible to agent authoring libraries. The toolkit ensures consistency across different libraries and adds helpful features like auto-tracing for retrievers.
from databricks_langchain import UCFunctionToolkit
# Create a toolkit with the Unity Catalog function
func_name = f"{CATALOG}.{SCHEMA}.add_numbers"
toolkit = UCFunctionToolkit(function_names=[func_name])
tools = toolkit.tools
Use the tool in an agent
Add the tool to a LangChain agent using the tools property from UCFunctionToolkit.
This example authors a simple agent using LangChain's AgentExecutor API. For production workloads, use the agent authoring workflow described in Author an AI agent and deploy it on Databricks Apps.
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain.prompts import ChatPromptTemplate
from databricks_langchain import (
    ChatDatabricks,
    UCFunctionToolkit,
)
import mlflow
# Initialize the LLM (replace with your LLM of choice, if desired)
LLM_ENDPOINT_NAME = "databricks-meta-llama-3-3-70b-instruct"
llm = ChatDatabricks(endpoint=LLM_ENDPOINT_NAME, temperature=0.1)
# Define the prompt
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Make sure to use tools for additional functionality.",
        ),
        ("placeholder", "{chat_history}"),
        ("human", "{input}"),
        ("placeholder", "{agent_scratchpad}"),
    ]
)
# Enable automatic tracing
mlflow.langchain.autolog()
# Define the agent, specifying the tools from the toolkit above
agent = create_tool_calling_agent(llm, tools, prompt)
# Create the agent executor
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_executor.invoke({"input": "What is 36939.0 + 8922.4?"})
Use Databricks Unity Catalog to integrate SQL and Python functions as tools in LlamaIndex workflows. This integration combines Unity Catalog governance with LlamaIndex's capabilities to index and query large datasets for LLMs.
Install the Databricks Unity Catalog integration package for LlamaIndex.
%pip install unitycatalog-llamaindex[databricks]
dbutils.library.restartPython()
Create an instance of the Unity Catalog functions client.
from unitycatalog.ai.core.base import get_uc_function_client
client = get_uc_function_client()
Create a Unity Catalog function written in Python.
CATALOG = "your_catalog"
SCHEMA = "your_schema"
func_name = f"{CATALOG}.{SCHEMA}.code_function"

def code_function(code: str) -> str:
    """
    Runs Python code.

    Args:
        code (str): The Python code to run.

    Returns:
        str: The result of running the Python code.
    """
    import sys
    from io import StringIO

    # Capture printed output, then restore the original stdout
    original_stdout = sys.stdout
    stdout = StringIO()
    sys.stdout = stdout
    try:
        exec(code)
    finally:
        sys.stdout = original_stdout
    return stdout.getvalue()

client.create_python_function(
    func=code_function,
    catalog=CATALOG,
    schema=SCHEMA,
    replace=True
)
Create an instance of the Unity Catalog function as a toolkit, and run it to verify that the tool behaves properly.
from unitycatalog.ai.llama_index.toolkit import UCFunctionToolkit
import mlflow

# Enable traces
mlflow.llama_index.autolog()

# Create a UCFunctionToolkit that includes the UC function
toolkit = UCFunctionToolkit(function_names=[func_name])

# Fetch the tools stored in the toolkit
tools = toolkit.tools
python_exec_tool = tools[0]

# Run the tool directly
result = python_exec_tool.call(code="print(1 + 1)")
print(result) # Outputs: {"format": "SCALAR", "value": "2\n"}
Use the tool in a LlamaIndex ReActAgent by defining the Unity Catalog function as part of a LlamaIndex tool collection. Then verify that the agent behaves properly by calling the LlamaIndex tool collection.
from llama_index.llms.openai import OpenAI
from llama_index.core.agent import ReActAgent
llm = OpenAI()
agent = ReActAgent.from_tools(tools, llm=llm, verbose=True)
agent.chat("Please run the following python code: `print(1 + 1)`")
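The `code_function` tool in this section captures printed output by pointing `sys.stdout` at a buffer. The following sketch performs the same capture with the standard library's `contextlib.redirect_stdout`, which puts the original stream back when the block exits, even if the executed code raises. It runs locally and is not registered in Unity Catalog. The name `run_and_capture` is illustrative.

```python
def run_and_capture(code: str) -> str:
    """Run Python code and return whatever it printed."""
    # Imports live inside the body, mirroring the example function above
    import contextlib
    from io import StringIO

    buffer = StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code)
    return buffer.getvalue()


output = run_and_capture("print(1 + 1)")
print(repr(output))
```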
Use Databricks Unity Catalog to integrate SQL and Python functions as tools in OpenAI workflows. This integration combines the governance of Unity Catalog with OpenAI to create powerful gen AI apps.
Install the Databricks Unity Catalog integration package for OpenAI.
%pip install unitycatalog-openai[databricks]
%pip install mlflow -U
dbutils.library.restartPython()
Create an instance of the Unity Catalog functions client.
from unitycatalog.ai.core.base import get_uc_function_client
client = get_uc_function_client()
Create a Unity Catalog function written in Python.
CATALOG = "your_catalog"
SCHEMA = "your_schema"
func_name = f"{CATALOG}.{SCHEMA}.code_function"

def code_function(code: str) -> str:
    """
    Runs Python code.

    Args:
        code (str): The Python code to run.

    Returns:
        str: The result of running the Python code.
    """
    import sys
    from io import StringIO

    # Capture printed output, then restore the original stdout
    original_stdout = sys.stdout
    stdout = StringIO()
    sys.stdout = stdout
    try:
        exec(code)
    finally:
        sys.stdout = original_stdout
    return stdout.getvalue()

client.create_python_function(
    func=code_function,
    catalog=CATALOG,
    schema=SCHEMA,
    replace=True
)
Create an instance of the Unity Catalog function as a toolkit and verify that the tool behaves properly by running the function.
from unitycatalog.ai.openai.toolkit import UCFunctionToolkit
import mlflow

# Enable tracing
mlflow.openai.autolog()

# Create a UCFunctionToolkit that includes the UC function
toolkit = UCFunctionToolkit(function_names=[func_name])

# Fetch the tools stored in the toolkit
tools = toolkit.tools

# Run the function directly to verify that it works
result = client.execute_function(func_name, {"code": "print(1 + 1)"})
print(result.value)
Submit the request to the OpenAI model along with the tools.
import openai

messages = [
    {
        "role": "system",
        "content": "You are a helpful customer support assistant. Use the supplied tools to assist the user.",
    },
    {"role": "user", "content": "What is the result of 2**10?"},
]

response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    tools=tools,
)

# Check the model response
print(response)
After OpenAI returns a response, run the Unity Catalog function that the model requested to produce the answer to send back to OpenAI.
import json

# OpenAI sends only a single request per tool call
tool_call = response.choices[0].message.tool_calls[0]

# Extract arguments that the Unity Catalog function needs to run
arguments = json.loads(tool_call.function.arguments)

# Run the function based on the arguments
result = client.execute_function(func_name, arguments)
print(result.value)
Once the answer has been returned, you can construct the response payload for subsequent calls to OpenAI.
# Create a message containing the result of the function call
function_call_result_message = {
    "role": "tool",
    "content": json.dumps({"content": result.value}),
    "tool_call_id": tool_call.id,
}
assistant_message = response.choices[0].message.to_dict()
completion_payload = {
    "model": "gpt-4o-mini",
    "messages": [*messages, assistant_message, function_call_result_message],
}

# Generate final response
openai.chat.completions.create(
    model=completion_payload["model"], messages=completion_payload["messages"]
)
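The steps above handle only the first tool call (`tool_calls[0]`). When a response carries several tool calls, each one needs its own `tool` message keyed by its `tool_call_id`. The following is a minimal sketch of that loop. It uses `SimpleNamespace` stand-ins shaped like the response fields used above, and a local `fake_execute` in place of `client.execute_function`. Both are assumptions made so that the example runs without network access, and the tool name `power_of_two` is made up.

```python
import json
from types import SimpleNamespace


def fake_execute(function_name: str, arguments: dict) -> SimpleNamespace:
    """Hypothetical stand-in for client.execute_function; returns a string value."""
    return SimpleNamespace(value=str(2 ** arguments["exponent"]))


def build_tool_messages(tool_calls, execute) -> list[dict]:
    """Run every requested tool call and build one 'tool' message per call."""
    messages = []
    for tool_call in tool_calls:
        # Arguments arrive as a JSON string and must be decoded first
        arguments = json.loads(tool_call.function.arguments)
        result = execute(tool_call.function.name, arguments)
        messages.append(
            {
                "role": "tool",
                "content": json.dumps({"content": result.value}),
                "tool_call_id": tool_call.id,
            }
        )
    return messages


# Two stand-in tool calls, shaped like response.choices[0].message.tool_calls
tool_calls = [
    SimpleNamespace(
        id="call_1",
        function=SimpleNamespace(name="power_of_two", arguments='{"exponent": 10}'),
    ),
    SimpleNamespace(
        id="call_2",
        function=SimpleNamespace(name="power_of_two", arguments='{"exponent": 3}'),
    ),
]

tool_messages = build_tool_messages(tool_calls, fake_execute)
for message in tool_messages:
    print(message)
```

In the real flow, the name passed to `client.execute_function` is the Unity Catalog name (`func_name` above), which may not match the tool name the model returns, so a mapping between the two may be needed.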
Utilities
To simplify the process of crafting the tool response, the unitycatalog-openai package has a utility, generate_tool_call_messages, that converts OpenAI ChatCompletion response messages so that they can be used for response generation.
from unitycatalog.ai.openai.utils import generate_tool_call_messages
messages = generate_tool_call_messages(response=response, client=client)
print(messages)
If the response contains multiple choice entries, pass the choice_index argument when calling generate_tool_call_messages to select which choice entry to use. Processing more than one choice entry at a time is not currently supported.
Use Databricks Unity Catalog to integrate SQL and Python functions as tools in Anthropic SDK LLM calls. This integration combines the governance of Unity Catalog with Anthropic models to create powerful gen AI apps.
The Anthropic integration requires Databricks Runtime 15.0 or above.
Install the Databricks Unity Catalog integration package for Anthropic.
%pip install unitycatalog-anthropic[databricks]
dbutils.library.restartPython()
Create an instance of the Unity Catalog functions client.
from unitycatalog.ai.core.base import get_uc_function_client
client = get_uc_function_client()
Create a Unity Catalog function written in Python.
CATALOG = "your_catalog"
SCHEMA = "your_schema"
func_name = f"{CATALOG}.{SCHEMA}.weather_function"

def weather_function(location: str) -> str:
    """
    Fetches the current weather from a given location in degrees Celsius.

    Args:
        location (str): The location to fetch the current weather from.

    Returns:
        str: The current temperature for the location provided in Celsius.
    """
    return f"The current temperature for {location} is 24.5 celsius"

client.create_python_function(
    func=weather_function,
    catalog=CATALOG,
    schema=SCHEMA,
    replace=True
)
Create an instance of the Unity Catalog function as a toolkit.
from unitycatalog.ai.anthropic.toolkit import UCFunctionToolkit

# Create an instance of the toolkit
toolkit = UCFunctionToolkit(function_names=[func_name], client=client)
Use a tool call in Anthropic.
import anthropic

# Initialize the Anthropic client with your API key
anthropic_client = anthropic.Anthropic(api_key="YOUR_ANTHROPIC_API_KEY")

# User's question
question = [{"role": "user", "content": "What's the weather in New York City?"}]

# Make the initial call to Anthropic
response = anthropic_client.messages.create(
    model="claude-3-5-sonnet-20240620", # Specify the model
    max_tokens=1024, # Use 'max_tokens' instead of 'max_tokens_to_sample'
    tools=toolkit.tools,
    messages=question # Provide the conversation history
)

# Print the response content
print(response)
Construct a tool response. The response from the Claude model contains a tool request metadata block if a tool needs to be called.
from unitycatalog.ai.anthropic.utils import generate_tool_call_messages

# Call the UC function and construct the required formatted response
tool_messages = generate_tool_call_messages(
    response=response,
    client=client,
    conversation_history=question
)

# Continue the conversation with Anthropic
tool_response = anthropic_client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    tools=toolkit.tools,
    messages=tool_messages,
)

print(tool_response)
The unitycatalog-anthropic package includes a message handler utility, generate_tool_call_messages, to simplify the parsing and handling of a call to the Unity Catalog function. The utility does the following:
- Detects tool calling requirements.
- Extracts tool calling information from the query.
- Performs the call to the Unity Catalog function.
- Parses the response from the Unity Catalog function.
- Crafts the next message format to continue the conversation with Claude.
The entire conversation history must be provided in the conversation_history argument to the generate_tool_call_messages API. Claude models require the initialization of the conversation (the original user input question) and all subsequent LLM-generated responses and multi-turn tool call results.
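The history requirement can be pictured as a growing list of messages. The following sketch shows that shape using plain dictionaries in the Anthropic Messages format (`tool_use` and `tool_result` content blocks). The ID, tool name, and values are made up, and `append_tool_round` is a hypothetical helper, not part of the unitycatalog-anthropic package.

```python
def append_tool_round(history, tool_use_id, tool_name, tool_input, tool_output):
    """Return a new history with one assistant tool request and its result appended."""
    assistant_turn = {
        "role": "assistant",
        "content": [
            {"type": "tool_use", "id": tool_use_id, "name": tool_name, "input": tool_input}
        ],
    }
    result_turn = {
        "role": "user",
        "content": [
            {"type": "tool_result", "tool_use_id": tool_use_id, "content": tool_output}
        ],
    }
    return [*history, assistant_turn, result_turn]


# The conversation starts with the original user question
history = [{"role": "user", "content": "What's the weather in New York City?"}]

history = append_tool_round(
    history,
    tool_use_id="toolu_example",  # made-up ID for illustration
    tool_name="weather_function",  # illustrative tool name
    tool_input={"location": "New York City"},
    tool_output="The current temperature for New York City is 24.5 celsius",
)

for turn in history:
    print(turn["role"], "->", turn["content"])
```

Each further tool round appends another assistant and user pair, so the list passed as `conversation_history` always begins with the original question.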