Integrate OpenAI with Databricks Unity Catalog tools

Use Databricks Unity Catalog to integrate SQL and Python functions as tools in OpenAI workflows. This integration combines the governance of Unity Catalog with OpenAI to create powerful gen AI apps.

Requirements

  • Install Python 3.10 or above.
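    For example, you can confirm the interpreter version from Python before installing the packages:

    Python
    import sys

    # The integration requires Python 3.10 or above
    assert sys.version_info >= (3, 10), "Python 3.10 or above is required"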

Integrate Unity Catalog tools with OpenAI

Run the following code in a notebook or Python script to create a Unity Catalog tool and use it when calling an OpenAI model.

  1. Install the Databricks Unity Catalog integration package for OpenAI.

    Python
    %pip install unitycatalog-openai[databricks]
    %pip install mlflow -U
    dbutils.library.restartPython()
  2. Create an instance of the Unity Catalog functions client.

    Python
    from unitycatalog.ai.core.base import get_uc_function_client

    client = get_uc_function_client()
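    If you run this code outside a Databricks notebook or want to construct the client explicitly, the package also provides a Databricks-specific client. A minimal sketch, assuming default workspace authentication (for example, environment variables or a Databricks configuration profile):

    Python
    from unitycatalog.ai.core.databricks import DatabricksFunctionClient

    # Explicitly construct the Unity Catalog functions client for Databricks
    client = DatabricksFunctionClient()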
  3. Create a Unity Catalog function written in Python. The function must include type hints and a docstring that describes its arguments and return value; Unity Catalog uses these to register the function's signature.

    Python
    CATALOG = "your_catalog"
    SCHEMA = "your_schema"

    func_name = f"{CATALOG}.{SCHEMA}.code_function"

    def code_function(code: str) -> str:
        """
        Runs Python code.

        Args:
            code (str): The Python code to run.

        Returns:
            str: The result of running the Python code.
        """
        import sys
        from io import StringIO

        stdout = StringIO()
        sys.stdout = stdout
        exec(code)
        return stdout.getvalue()

    client.create_python_function(
        func=code_function,
        catalog=CATALOG,
        schema=SCHEMA,
        replace=True
    )
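    Optionally, confirm that the function was registered in Unity Catalog. A sketch, assuming the functions client exposes a get_function lookup (check the API of your installed unitycatalog-ai version):

    Python
    # Assumption: the client provides get_function for looking up a registered function by its full name
    function_info = client.get_function(func_name)
    print(function_info)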
  4. Create a toolkit instance that includes the Unity Catalog function and verify that the tool behaves properly by running the function.

    Python
    from unitycatalog.ai.openai.toolkit import UCFunctionToolkit
    import mlflow

    # Enable tracing
    mlflow.openai.autolog()

    # Create a UCFunctionToolkit that includes the UC function
    toolkit = UCFunctionToolkit(function_names=[func_name])

    # Fetch the tools stored in the toolkit
    tools = toolkit.tools

    # Verify the tool behaves as expected by running the function directly
    result = client.execute_function(func_name, {"code": "print(1 + 1)"})
    print(result.value)
  5. Submit the request to the OpenAI model along with the tools.

    Python
    import openai

    messages = [
        {
            "role": "system",
            "content": "You are a helpful customer support assistant. Use the supplied tools to assist the user.",
        },
        {"role": "user", "content": "What is the result of 2**10?"},
    ]
    response = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        tools=tools,
    )
    # check the model response
    print(response)
  6. After OpenAI returns a response that contains a tool call, run the Unity Catalog function to produce the result to send back to OpenAI.

    Python
    import json

    # This example assumes the model made a single tool call; take the first one
    tool_call = response.choices[0].message.tool_calls[0]
    # Extract arguments that the Unity Catalog function needs to run
    arguments = json.loads(tool_call.function.arguments)

    # Run the function based on the arguments
    result = client.execute_function(func_name, arguments)
    print(result.value)
  7. After the function result is returned, construct the response payload for the follow-up call to OpenAI and generate the final answer.

    Python
    # Create a message containing the result of the function call
    function_call_result_message = {
        "role": "tool",
        "content": json.dumps({"content": result.value}),
        "tool_call_id": tool_call.id,
    }
    assistant_message = response.choices[0].message.to_dict()
    completion_payload = {
        "model": "gpt-4o-mini",
        "messages": [*messages, assistant_message, function_call_result_message],
    }

    # Generate the final response and print the model's answer
    final_response = openai.chat.completions.create(
        model=completion_payload["model"], messages=completion_payload["messages"]
    )
    print(final_response.choices[0].message.content)

Utilities

To simplify crafting the tool response, the unitycatalog-openai package provides a utility, generate_tool_call_messages, that converts OpenAI ChatCompletion response messages so that they can be used for response generation.

Python
from unitycatalog.ai.openai.utils import generate_tool_call_messages

messages = generate_tool_call_messages(response=response, client=client)
print(messages)
note

If the response contains multiple choice entries, pass the choice_index argument to generate_tool_call_messages to select which choice entry to use. Processing more than one choice entry at a time is not supported.
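
Continuing from the earlier steps, the following sketch sends the generated messages back to OpenAI for the final answer. It assumes that response and client are still in scope and that the returned list contains the assistant tool-call message plus the tool results; the names conversation, tool_messages, and final_response are illustrative:

Python
import openai

from unitycatalog.ai.openai.utils import generate_tool_call_messages

# The original conversation from step 5, repeated here for completeness
conversation = [
    {
        "role": "system",
        "content": "You are a helpful customer support assistant. Use the supplied tools to assist the user.",
    },
    {"role": "user", "content": "What is the result of 2**10?"},
]

# Convert the tool-call response into follow-up messages, executing the Unity Catalog function
tool_messages = generate_tool_call_messages(response=response, client=client)

# Send the combined history back to OpenAI to generate the final answer
final_response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[*conversation, *tool_messages],
)
print(final_response.choices[0].message.content)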