Use trusted assets in AI/BI Genie spaces

Preview

This feature is in Public Preview. See Enable Genie spaces in your workspace.

This article defines trusted assets and explains how to use them to provide verified answers in a Genie space.

What are trusted assets?

Trusted assets are predefined functions and example queries meant to provide verified answers to questions that you anticipate from users. When a user submits a question that invokes a trusted asset, it’s indicated in the response, adding an extra layer of assurance to the accuracy of the results.

Trusted assets can include the following:

  • Parameterized example SQL queries: When a parameterized example SQL query is used to generate a response, the reponse is labeled a trusted asset. The response includes the values used as arguments in the query.

  • User-defined table functions (UDFs): You can define custom functions and register them with Unity Catalog. Then, you can add those functions as trusted assets when you’re setting up instructions in your Genie space. See Create a SQL table function and User-defined functions (UDFs) in Unity Catalog.

Sample question answered with a trusted asset.

Note

Trusted assets are not a substitute for all other instructions. Databricks recommends using trusted assets for well-established recurring questions. They provide exact answers to specific questions.

Why create trusted assets?

When using any AI tool, users should evaluate the accuracy of generated responses. Typically, they do this by considering whether the answer makes sense and effectively addresses their question. With Genie, a response is delivered as a table of results. Users can review the generated SQL that creates the result set, but non-technical users might not have the background to interpret the SQL statement or assess the correctness of the answer. Trusted assets help reduce the likelihood of these users encountering responses that are misleading, incorrect, or difficult to interpret.

When a user receives a response that is labeled as a trusted asset, they can be confident that a domain expert has reviewed the SQL statement populating the results.

What’s the difference between trusted assets and example SQL queries?

Trusted assets provide verified answers to questions you expect Genie space users to ask. When a trusted asset can answer a user’s question, the instruction you have stored as a trusted asset runs and returns the specified result set. All SQL Functions that you include in the instructions are treated as trusted assets. Example SQL Queries that include parameters can also be treated as trusted assets.

  • SQL Functions: You can write custom SQL functions tailored to handle your data and address company-specific questions. Genie does not consider the SQL content of your trusted assets when responding to questions.

  • Example SQL Queries (with parameters): When a parameterized example query’s exact text is used to generate a response, the response is automatically labeled as a trusted asset. If the exact text of the query is not used, or the example query does not use parameters, the query provides context and guides Genie in generating the SQL statements to process other questions.

Define a trusted asset

Defining a trusted asset starts with identifying a likely question. For example, suppose you’re working with a sales pipeline dataset, and a common question that a sales manager might ask is, “What are the open sales opportunities in my pipeline?”

Example: Use a UDF

The following steps outline the steps to create a trusted asset, as a UDF, that answers this question:

  1. Define and test a SQL query that answers the question.

    This query joins two tables and returns a dataset of open opportunities listed in the “Pipeline” forecast category. In this step, the goal is to write a basic query that returns the expected results.

    SELECT
      o.id AS `OppId`,
      a.region__c AS `Region`,
      o.name AS `Opportunity Name`,
      o.forecastcategory AS `Forecast Category`,
      o.stagename,
      o.closedate AS `Close Date`,
      o.amount AS `Opp Amount`
    FROM
    users.user_name.opportunity o
    JOIN catalog.schema.accounts a ON o.accountid = a.id
    WHERE
    o.forecastcategory = 'Pipeline'
    AND o.stagename NOT LIKE '%closed%';
    
  2. Define a Unity Catalog function.

    Your Unity Catalog function should parameterize the query and produce results matching the specific conditions you expect the user to ask about. Suppose the sales manager wants to narrow the result set by focusing on a particular region or group of regions.

    The following example defines a Unity Catalog function that takes a list of regions as a parameter and returns a table. The function return is almost identical to the SQL statement in the previous step, except the WHERE clause has been modified to filter the results by region if a region has been provided. The comments provided in the function definitions are critical for instructing the Genie space on when and how to invoke this function.

    • Parameter comments: The open_opps_in_region function expects an array of strings as a parameter. The comment includes an example of the expected input. If no parameter is supplied, the default value is NULL. See Tips for writing functions for more on including optional parameters and comments.

    • Function comments: The comment in the SQL table function provides a detailed explanation of what the function does. This is critical because it informs Genie when to use the function as a response to user questions. The comment should describe the function’s purpose as precisely as possible. This information guides Genie in recognizing the function’s relevance to specific questions.

    CREATE
    OR REPLACE FUNCTION users.user_name.open_opps_in_region (
      regions ARRAY < STRING >
      COMMENT 'List of regions.  Example: ["APAC", "EMEA"]' DEFAULT NULL
    ) RETURNS TABLE
    COMMENT 'Addresses questions about the pipeline in the specified regions by returning
     a list of all the open opportunities. If no region is specified, returns all open opportunities.
     Example questions: "What is the pipeline for APAC and EMEA?", "Open opportunities in
     APAC"'
    RETURN
      SELECT
      o.id AS `OppId`,
      a.region__c AS `Region`,
      o.name AS `Opportunity Name`,
      o.forecastcategory AS `Forecast Category`,
      o.stagename,
      o.closedate AS `Close Date`,
      o.amount AS `Opp Amount`
      FROM
      catalog.schema.opportunity o
      JOIN catalog.schema.accounts a ON o.accountid = a.id
      WHERE
      o.forecastcategory = 'Pipeline'
      AND o.stagename NOT LIKE '%closed%'
      AND (
        isnull(open_opps_in_region.regions)
        OR array_contains(open_opps_in_region.regions, region__c)
      );
    

    When you run the code to create a function, it’s registered to the currently active schema by default. See Custom SQL functions in Unity Catalog. See Create a SQL table function for syntax and examples.

  3. Add a trusted asset.

    After the function is created in Unity Catalog, a user with at least CAN EDIT permission on the Genie space can add it from the Instructions tab of the Genie space.

Required permissions

Users with at least CAN EDIT permission on a Genie space can add or remove trusted assets.

Genie space users must have CAN USE permission on the catalog and schema that contains the function. To invoke a trusted asset, they must have EXECUTE permission on the function in Unity Catalog. Unity Catalog securable objects inherit permissions from their parent containers. See Securable objects in Unity Catalog.

To simplify sharing in a Genie space, Databricks recommends creating a dedicated schema to contain all the functions you want to use in your Genie space.

Tips for writing functions

Review the following examples to learn how to create dynamic functions for trusted assets.

Include a default parameter value

You can specify a default value for a parameter. Use the DEFAULT clause in the function signature as shown in the following example:

countries ARRAY<STRING> COMMENT 'List of countries' DEFAULT ARRAY()

Include example parameter values

For columns with a set enumeration of values, increase accuracy by defining them clearly in the comment. The following example provides a sample list of values:

regions ARRAY < STRING > COMMENT 'List of regions. Values: ["AF","AN","AS", "EU", "NA", "OC", "SA", NULL]'

Create an optional parameter

To create an optional parameter, set the default parameter to NULL as shown in the following example:

min_date STRING DEFAULT NULL

Specify formatting with comments

You can specify an exact format for a parameter by including it in a comment, as shown in the following example:

min_date STRING COMMENT 'minimum date (included) for a transaction, in `yyyy-mm-dd` format'

Explictly check for NULL values

If you include an optional parameter, one possible value you should expect is NULL. Because comparison with NULL can yield unpredictable results, you should explicitly build a check for NULL values into your function. The following example provides example syntax:

WHERE (isnull(min_date) OR created_date >= min_date)