Use trusted assets in AI/BI Genie spaces
Preview
This feature is in Public Preview. See Enable Genie spaces in your workspace.
This article defines trusted assets and explains how to use them to provide verified answers in a Genie space.
What are trusted assets?
Trusted assets are predefined functions and example queries meant to provide verified answers to questions that you anticipate from users. When a user submits a question that invokes a trusted asset, it’s indicated in the response, adding an extra layer of assurance to the accuracy of the results.
Trusted assets can include the following:
Parameterized example SQL queries: When a parameterized example SQL query is used to generate a response, the reponse is labeled a trusted asset. The response includes the values used as arguments in the query.
User-defined table functions (UDFs): You can define custom functions and register them with Unity Catalog. Then, you can add those functions as trusted assets when you’re setting up instructions in your Genie space. See Create a SQL table function and User-defined functions (UDFs) in Unity Catalog.
Note
Trusted assets are not a substitute for all other instructions. Databricks recommends using trusted assets for well-established recurring questions. They provide exact answers to specific questions.
Why create trusted assets?
When using any AI tool, users should evaluate the accuracy of generated responses. Typically, they do this by considering whether the answer makes sense and effectively addresses their question. With Genie, a response is delivered as a table of results. Users can review the generated SQL that creates the result set, but non-technical users might not have the background to interpret the SQL statement or assess the correctness of the answer. Trusted assets help reduce the likelihood of these users encountering responses that are misleading, incorrect, or difficult to interpret.
When a user receives a response that is labeled as a trusted asset, they can be confident that a domain expert has reviewed the SQL statement populating the results.
What’s the difference between trusted assets and example SQL queries?
Trusted assets provide verified answers to questions you expect Genie space users to ask. When a trusted asset can answer a user’s question, the instruction you have stored as a trusted asset runs and returns the specified result set. All SQL Functions that you include in the instructions are treated as trusted assets. Example SQL Queries that include parameters can also be treated as trusted assets.
SQL Functions: You can write custom SQL functions tailored to handle your data and address company-specific questions. Genie does not consider the SQL content of your trusted assets when responding to questions.
Example SQL Queries (with parameters): When a parameterized example query’s exact text is used to generate a response, the response is automatically labeled as a trusted asset. If the exact text of the query is not used, or the example query does not use parameters, the query provides context and guides Genie in generating the SQL statements to process other questions.
Define a trusted asset
Defining a trusted asset starts with identifying a likely question. For example, suppose you’re working with a sales pipeline dataset, and a common question that a sales manager might ask is, “What are the open sales opportunities in my pipeline?”
Example: Use a UDF
The following steps outline the steps to create a trusted asset, as a UDF, that answers this question:
Define and test a SQL query that answers the question.
This query joins two tables and returns a dataset of open opportunities listed in the “Pipeline” forecast category. In this step, the goal is to write a basic query that returns the expected results.
SELECT o.id AS `OppId`, a.region__c AS `Region`, o.name AS `Opportunity Name`, o.forecastcategory AS `Forecast Category`, o.stagename, o.closedate AS `Close Date`, o.amount AS `Opp Amount` FROM users.user_name.opportunity o JOIN catalog.schema.accounts a ON o.accountid = a.id WHERE o.forecastcategory = 'Pipeline' AND o.stagename NOT LIKE '%closed%';
Define a Unity Catalog function.
Your Unity Catalog function should parameterize the query and produce results matching the specific conditions you expect the user to ask about. Suppose the sales manager wants to narrow the result set by focusing on a particular region or group of regions.
The following example defines a Unity Catalog function that takes a list of regions as a parameter and returns a table. The function return is almost identical to the SQL statement in the previous step, except the
WHERE
clause has been modified to filter the results by region if a region has been provided. The comments provided in the function definitions are critical for instructing the Genie space on when and how to invoke this function.Parameter comments: The
open_opps_in_region
function expects an array of strings as a parameter. The comment includes an example of the expected input. If no parameter is supplied, the default value isNULL
. See Tips for writing functions for more on including optional parameters and comments.Function comments: The comment in the SQL table function provides a detailed explanation of what the function does. This is critical because it informs Genie when to use the function as a response to user questions. The comment should describe the function’s purpose as precisely as possible. This information guides Genie in recognizing the function’s relevance to specific questions.
CREATE OR REPLACE FUNCTION users.user_name.open_opps_in_region ( regions ARRAY < STRING > COMMENT 'List of regions. Example: ["APAC", "EMEA"]' DEFAULT NULL ) RETURNS TABLE COMMENT 'Addresses questions about the pipeline in the specified regions by returning a list of all the open opportunities. If no region is specified, returns all open opportunities. Example questions: "What is the pipeline for APAC and EMEA?", "Open opportunities in APAC"' RETURN SELECT o.id AS `OppId`, a.region__c AS `Region`, o.name AS `Opportunity Name`, o.forecastcategory AS `Forecast Category`, o.stagename, o.closedate AS `Close Date`, o.amount AS `Opp Amount` FROM catalog.schema.opportunity o JOIN catalog.schema.accounts a ON o.accountid = a.id WHERE o.forecastcategory = 'Pipeline' AND o.stagename NOT LIKE '%closed%' AND ( isnull(open_opps_in_region.regions) OR array_contains(open_opps_in_region.regions, region__c) );
When you run the code to create a function, it’s registered to the currently active schema by default. See Custom SQL functions in Unity Catalog. See Create a SQL table function for syntax and examples.
Add a trusted asset.
After the function is created in Unity Catalog, a user with at least CAN EDIT permission on the Genie space can add it from the Instructions tab of the Genie space.
Required permissions
Users with at least CAN EDIT permission on a Genie space can add or remove trusted assets.
Genie space users must have CAN USE
permission on the catalog and schema that contains the function. To invoke a trusted asset, they must have EXECUTE
permission on the function in Unity Catalog. Unity Catalog securable objects inherit permissions from their parent containers. See Securable objects in Unity Catalog.
To simplify sharing in a Genie space, Databricks recommends creating a dedicated schema to contain all the functions you want to use in your Genie space.
Tips for writing functions
Review the following examples to learn how to create dynamic functions for trusted assets.
Include a default parameter value
You can specify a default value for a parameter. Use the DEFAULT
clause in the function signature as shown in the following example:
countries ARRAY<STRING> COMMENT 'List of countries' DEFAULT ARRAY()
Include example parameter values
For columns with a set enumeration of values, increase accuracy by defining them clearly in the comment. The following example provides a sample list of values:
regions ARRAY < STRING > COMMENT 'List of regions. Values: ["AF","AN","AS", "EU", "NA", "OC", "SA", NULL]'
Create an optional parameter
To create an optional parameter, set the default parameter to NULL
as shown in the following example:
min_date STRING DEFAULT NULL
Specify formatting with comments
You can specify an exact format for a parameter by including it in a comment, as shown in the following example:
min_date STRING COMMENT 'minimum date (included) for a transaction, in `yyyy-mm-dd` format'
Explictly check for NULL
values
If you include an optional parameter, one possible value you should expect is NULL
. Because comparison with NULL
can yield unpredictable results, you should explicitly build a check for NULL
values into your function. The following example provides example syntax:
WHERE (isnull(min_date) OR created_date >= min_date)