Integrate Unity Catalog tools with third-party generative AI frameworks

Unity Catalog AI agent tools can be used in popular generative AI libraries like LangChain, LlamaIndex, OpenAI, and Anthropic. These integrations combine Unity Catalog tool governance with the capabilities of third-party agent authoring frameworks. For example:

  • In LangChain, Unity Catalog functions can be part of an agent's workflow to perform tasks like querying or transforming data.
  • In OpenAI or Anthropic integrations, the functions are called directly by the AI model during execution.

Select your framework in the following tabs to create a Unity Catalog tool and use it with that framework. Run the code in a Databricks notebook or Python script.

Requirements

  • Install Python 3.10 or above.

Use Databricks Unity Catalog to integrate SQL and Python functions as tools in LangChain and LangGraph workflows. This integration combines the governance of Unity Catalog with LangChain capabilities to build powerful LLM-based applications.

In this example, you create a Unity Catalog tool, test its functionality, and add it to an agent.

Install dependencies

Install the Unity Catalog AI integration package with the Databricks extra, then install the Databricks LangChain integration package.

Python
# Install the Unity Catalog AI integration package with the Databricks extra
%pip install unitycatalog-langchain[databricks]

# Install the Databricks LangChain integration package
%pip install databricks-langchain
dbutils.library.restartPython()

Initialize the Databricks Function Client

Initialize the Databricks Function Client, which you use to create and execute Unity Catalog functions.

Python
from unitycatalog.ai.core.base import get_uc_function_client

client = get_uc_function_client()

Define the tool's logic

Create a Unity Catalog function containing the tool's logic.

Python

CATALOG = "my_catalog"
SCHEMA = "my_schema"

def add_numbers(number_1: float, number_2: float) -> float:
    """
    A function that accepts two floating point numbers, adds them,
    and returns the resulting sum as a float.

    Args:
        number_1 (float): The first of the two numbers to add.
        number_2 (float): The second of the two numbers to add.

    Returns:
        float: The sum of the two input numbers.
    """
    return number_1 + number_2

# Register the function in Unity Catalog, replacing any existing version
function_info = client.create_python_function(
    func=add_numbers,
    catalog=CATALOG,
    schema=SCHEMA,
    replace=True
)
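Before registering a function, it can help to exercise it locally as plain Python. The following is a minimal sketch that needs no Databricks connection. It assumes that a well-formed tool has type hints on every parameter, an annotated return type, and a docstring, as add_numbers does. The helper check_tool_function is hypothetical and is not part of any Unity Catalog package; the client may apply its own, different validation.

```python
import inspect
from typing import get_type_hints


def add_numbers(number_1: float, number_2: float) -> float:
    """
    A function that accepts two floating point numbers, adds them,
    and returns the resulting sum as a float.

    Args:
        number_1 (float): The first of the two numbers to add.
        number_2 (float): The second of the two numbers to add.

    Returns:
        float: The sum of the two input numbers.
    """
    return number_1 + number_2


def check_tool_function(func) -> list[str]:
    """Hypothetical helper: list problems found in a candidate tool function."""
    problems = []
    hints = get_type_hints(func)
    # Every parameter should carry a type hint.
    for name in inspect.signature(func).parameters:
        if name not in hints:
            problems.append(f"parameter '{name}' has no type hint")
    # The return type should be annotated as well.
    if "return" not in hints:
        problems.append("return type is not annotated")
    # The docstring is what describes the tool to a model.
    if not inspect.getdoc(func):
        problems.append("docstring is missing")
    return problems


problems = check_tool_function(add_numbers)
local_sum = add_numbers(1.5, 2.25)
print(problems, local_sum)
```

An empty problem list together with a correct local result suggests the function is ready to register.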

Test the function

Test your function to check that it works as expected:

Python
result = client.execute_function(
    function_name=f"{CATALOG}.{SCHEMA}.add_numbers",
    parameters={"number_1": 36939.0, "number_2": 8922.4}
)

result.value  # OUTPUT: '45861.4'
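The output shown above is a string rather than a float, so a numeric check has to parse it first. The following is a minimal sketch under that assumption. The variable raw_value is a stand-in for result.value so that the snippet runs without a Databricks connection.

```python
import math

# Stand-in for result.value; the example output above is the string '45861.4'.
raw_value = "45861.4"

# Parse the string before doing any numeric comparison.
parsed = float(raw_value)
expected = 36939.0 + 8922.4

# Compare with a tolerance instead of ==, because both sides are floats.
matches = math.isclose(parsed, expected, rel_tol=1e-9)
print(matches)
```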

Wrap the function using the UCFunctionToolkit

Wrap the function using the UCFunctionToolkit to make it accessible to agent authoring libraries. The toolkit ensures consistency across different libraries and adds helpful features like auto-tracing for retrievers.

Python
from databricks_langchain import UCFunctionToolkit

# Create a toolkit with the Unity Catalog function
func_name = f"{CATALOG}.{SCHEMA}.add_numbers"
toolkit = UCFunctionToolkit(function_names=[func_name])

tools = toolkit.tools
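Because function_names takes a list, several functions can be wrapped in one toolkit. The following is a minimal sketch of building the fully qualified names in the catalog.schema.function form used above. The helper qualified_name and the second function name multiply_numbers are hypothetical and appear only for illustration.

```python
CATALOG = "my_catalog"
SCHEMA = "my_schema"


def qualified_name(catalog: str, schema: str, function: str) -> str:
    """Hypothetical helper: build a three-level catalog.schema.function name."""
    parts = [catalog, schema, function]
    # Reject empty parts and parts that already contain a dot.
    if any(not part or "." in part for part in parts):
        raise ValueError(f"each part must be non-empty and contain no dots: {parts}")
    return ".".join(parts)


# "multiply_numbers" is a made-up second function used only for illustration.
function_names = [
    qualified_name(CATALOG, SCHEMA, name)
    for name in ("add_numbers", "multiply_numbers")
]
print(function_names)
```

The resulting list could then be passed as the function_names argument when constructing the toolkit.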

Use the tool in an agent

Add the tool to a LangChain agent using the tools property from UCFunctionToolkit.

This example authors a simple agent using LangChain's AgentExecutor API. For production workloads, use the agent authoring workflow described in Author an AI agent and deploy it on Databricks Apps.

Python
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain.prompts import ChatPromptTemplate
from databricks_langchain import (
    ChatDatabricks,
    UCFunctionToolkit,
)
import mlflow

# Initialize the LLM (replace with your LLM of choice, if desired)
LLM_ENDPOINT_NAME = "databricks-meta-llama-3-3-70b-instruct"
llm = ChatDatabricks(endpoint=LLM_ENDPOINT_NAME, temperature=0.1)

# Define the prompt
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Make sure to use tools for additional functionality.",
        ),
        ("placeholder", "{chat_history}"),
        ("human", "{input}"),
        ("placeholder", "{agent_scratchpad}"),
    ]
)

# Enable automatic tracing
mlflow.langchain.autolog()

# Define the agent, specifying the tools from the toolkit above
agent = create_tool_calling_agent(llm, tools, prompt)

# Create the agent executor
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_executor.invoke({"input": "What is 36939.0 + 8922.4?"})