Use trusted assets in AI/BI Genie spaces

Preview

This feature is in Public Preview. Workspace admins can enable this feature from the Previews page. See Manage Databricks Previews.

This article defines trusted assets and explains how to use them to provide verified answers in a Genie space.

What are trusted assets?

Trusted assets are user-defined table functions that you can write to answer specific questions that you anticipate from users. You can use them to define validated, parameterized SQL functions that act as recipes to answer common questions. When a user submits a question that invokes a trusted asset, it’s indicated in the response, adding an extra layer of assurance to the accuracy of the results.

Sample question answered with a trusted asset.

Note

Trusted assets are not a substitute for all other instructions. Databricks recommends using trusted assets for well-established recurring questions. They provide exact answers to specific questions and are not reused by Genie to address adjacent questions.

Why create trusted assets?

When using any AI tool, users should evaluate the accuracy of generated responses. Typically, they do this by considering whether the answer makes sense and effectively addresses their question. With Genie, a response is delivered as a table of results. Users can review the generated SQL that creates the result set, but non-technical users might not have the background to interpret the SQL statement or assess the correctness of the answer. Trusted assets help reduce the likelihood of these users encountering responses that are misleading, incorrect, or difficult to interpret.

When a user receives a response that uses a trusted asset, they can be confident that a domain expert has reviewed the SQL statement populating the results.

What’s the difference between trusted assets and example SQL queries?

Trusted assets provide verified answers to questions you expect Genie space users to ask. When a trusted asset can answer a user’s question, the user-defined table function stored as a trusted asset runs and returns the specified result set. That way, you know that your users are receiving correct information that answers their questions, and business users are provided the additional assurance that they’re receiving a vetted answer. Genie does not consider the SQL content of your trusted asset when answering questions.

Example SQL queries offer another option for setting up a Genie space. They provide context to teach Genie how to process other questions and write the SQL statements it generates. The exact examples provided in the Example SQL Queries section of an instruction set are not necessarily returned when a business user asks a related question.

Define a trusted asset

Defining a trusted asset starts with identifying a likely question. For example, suppose you’re working with a sales pipeline dataset, and a common question that a sales manager might ask is, “What are the open sales opportunities in my pipeline?”

The following steps outline the steps to create a trusted asset that answers this question:

  1. Define and test a SQL query that answers the question.

    This query joins two tables and returns a dataset of open opportunities listed in the “Pipeline” forecast category. In this step, the goal is to write a basic query that returns the expected results.

    SELECT
      o.id AS `OppId`,
      a.region__c AS `Region`,
      o.name AS `Opportunity Name`,
      o.forecastcategory AS `Forecast Category`,
      o.stagename,
      o.closedate AS `Close Date`,
      o.amount AS `Opp Amount`
    FROM
    users.user_name.opportunity o
    JOIN catalog.schema.accounts a ON o.accountid = a.id
    WHERE
    o.forecastcategory = 'Pipeline'
    AND o.stagename NOT LIKE '%closed%';
    
  2. Define a Unity Catalog function.

    Your Unity Catalog function should parameterize the query and produce results matching the specific conditions you expect the user to ask about. Suppose the sales manager wants to narrow the result set by focusing on a particular region or group of regions.

    The following example defines a Unity Catalog function that takes a list of regions as a parameter and returns a table. The function return is almost identical to the SQL statement in the previous step, except the WHERE clause has been modified to filter the results by region if a region has been provided. The comments provided in the function definitions are critical for instructing the Genie space on when and how to invoke this function.

    • Parameter comments: The open_opps_in_region function expects an array of strings as a parameter. The comment includes an example of the expected input. If no parameter is supplied, the default value is NULL. See Tips for writing functions for more on including optional parameters and comments.

    • Function comments: The comment in the SQL table function provides a detailed explanation of what the function does. This is critical because it informs Genie when to use the function as a response to user questions. The comment should describe the function’s purpose as precisely as possible. This information guides Genie in recognizing the function’s relevance to specific questions.

    CREATE
    OR REPLACE FUNCTION users.user_name.open_opps_in_region (
      regions ARRAY < STRING >
      COMMENT 'List of regions.  Example: ["APAC", "EMEA"]' DEFAULT NULL
    ) RETURNS TABLE
    COMMENT 'Addresses questions about the pipeline in the specified regions by returning
     a list of all the open opportunities. If no region is specified, returns all open opportunities.
     Example questions: "What is the pipeline for APAC and EMEA?", "Open opportunities in
     APAC"'
    RETURN
      SELECT
      o.id AS `OppId`,
      a.region__c AS `Region`,
      o.name AS `Opportunity Name`,
      o.forecastcategory AS `Forecast Category`,
      o.stagename,
      o.closedate AS `Close Date`,
      o.amount AS `Opp Amount`
      FROM
      catalog.schema.opportunity o
      JOIN catalog.schema.accounts a ON o.accountid = a.id
      WHERE
      o.forecastcategory = 'Pipeline'
      AND o.stagename NOT LIKE '%closed%'
      AND (
        isnull(open_opps_in_region.regions)
        OR array_contains(open_opps_in_region.regions, region__c)
      );
    

    When you run the code to create a function, it’s registered to the currently active schema by default. See Custom SQL functions in Unity Catalog. See Create a SQL table function for syntax and examples.

  3. Add a trusted asset.

    After the function is created in Unity Catalog, a user with at least CAN EDIT permission on the Genie space can add it from the Instructions tab of the Genie space.

Required permissions

Users with at least CAN EDIT permission on a Genie space can add or remove trusted assets.

Genie space users must have CAN USE permission on the catalog and schema that contains the function. To invoke a trusted asset, they must have EXECUTE permission on the function in Unity Catalog. Unity Catalog securable objects inherit permissions from their parent containers. See Securable objects in Unity Catalog.

To simplify sharing in a Genie space, Databricks recommends creating a dedicated schema to contain all the functions you want to use in your Genie space.

Tips for writing functions

Review the following examples to learn how to create dynamic functions for trusted assets.

Include a default parameter value

You can specify a default value for a parameter. Use the DEFAULT clause in the function signature as shown in the following example:

countries ARRAY<STRING> COMMENT 'List of countries' DEFAULT ARRAY()

Include example parameter values

For columns with a set enumeration of values, increase accuracy by defining them clearly in the comment. The following example provides a sample list of values:

regions ARRAY < STRING > COMMENT 'List of regions. Values: ["AF","AN","AS", "EU", "NA", "OC", "SA", NULL]'

Create an optional parameter

To create an optional parameter, set the default parameter to NULL as shown in the following example:

min_date STRING DEFAULT NULL

Specify formatting with comments

You can specify an exact format for a parameter by including it in a comment, as shown in the following example:

min_date STRING COMMENT 'minimum date (included) for a transaction, in `yyyy-mm-dd` format'

Explictly check for NULL values

If you include an optional parameter, one possible value you should expect is NULL. Because comparison with NULL can yield unpredictable results, you should explicitly build a check for NULL values into your function. The following example provides example syntax:

WHERE (isnull(min_date) OR created_date >= min_date)