Use value sampling to improve Genie's accuracy
This feature is in Public Preview.
Value sampling helps Genie generate more accurate SQL by collecting and using real data values from your tables. It has two components:
- Example values: Small samples from each column that help Genie understand data type and formatting.
- Value dictionaries: Curated lists of the most relevant values in a column, used to match user prompts to actual data.
Overview
When a user asks a question in Genie, the phrasing is often conversational and can include errors such as misspellings. In these cases, the values in the prompt might not match the structure or values in the data. This can cause Genie to misinterpret the question and generate incorrect SQL.
For example, a user might ask:
"Show me car sales in Florida for Q1."
If the data uses state abbreviations (such as FL), and Genie cannot access the values for that column, Genie might generate SQL that includes ILIKE '%Florida%', which returns no results.
Enabling value sampling on the state column allows Genie to access representative values. With this context, Genie can recognize that FL corresponds to “Florida” and generate more accurate SQL.
[Image comparison: without value dictionary | with value dictionary]
Value sampling helps Genie return correct results by improving its ability to generate accurate SQL.
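The Florida example can be reproduced in miniature. The sketch below is only an illustration, not Genie's behavior or output: the car_sales table, its columns, and its rows are hypothetical, and an in-memory SQLite database stands in for the real data source. SQLite has no ILIKE, so the sketch uses LIKE, which SQLite treats as case-insensitive for ASCII text.

```python
import sqlite3

# Hypothetical sales table; SQLite stands in for the real data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car_sales (state TEXT, quarter TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO car_sales VALUES (?, ?, ?)",
    [("FL", "Q1", 120), ("FL", "Q2", 95), ("GA", "Q1", 80)],
)

# Without sampled values: the filter copies the word used in the prompt.
without_values = conn.execute(
    "SELECT state, units FROM car_sales "
    "WHERE state LIKE '%Florida%' AND quarter = 'Q1'"
).fetchall()

# With sampled values: the filter uses the abbreviation that is actually stored.
with_values = conn.execute(
    "SELECT state, units FROM car_sales "
    "WHERE state = 'FL' AND quarter = 'Q1'"
).fetchall()

print(without_values)  # []
print(with_values)     # [('FL', 120)]
```

The first query is valid SQL and runs without error, which is why this kind of mismatch is easy to miss: it simply returns no rows.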
Requirements
- Genie spaces must be enabled. See Manage Genie access.
- The Genie Data Sampling preview setting is enabled by default. If necessary, a workspace admin can manage access to the preview from the Previews page. The preview must be set to On for Genie space authors to use example values and value dictionaries.
How value sampling works
Genie automatically stores example values and creates value dictionaries for eligible columns as you add tables to the space. Tables with row filters or column masks are excluded. The column list view shows tags to indicate which columns include Example values or Value dictionaries.
- Example values are collected for all eligible columns and help Genie understand data type and formatting.
- Value dictionaries are created for up to 60 columns in a space. They are most useful on columns where users are likely to reference specific values, such as states and product categories. Each dictionary can include up to 1,024 distinct values that are less than 127 characters in length. If the space limit for value dictionaries is reached and you want to adjust which columns are included, you can manually select the columns. For instructions, see Manage value dictionaries. Value dictionaries are stored in your workspace's storage bucket.
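Before choosing columns manually, it can help to check whether a column's values fit within the limits quoted above. The following is a minimal sketch under stated assumptions: check_dictionary_fit and DictionaryFit are hypothetical helpers, not part of any Genie or Databricks API, and the sketch does not model how Genie itself selects values when a column exceeds the limits. It assumes you have already pulled the column's string values into a Python list.

```python
from dataclasses import dataclass

# Limits quoted from the text above.
MAX_DISTINCT_VALUES = 1024
MAX_VALUE_LENGTH = 127  # values must be shorter than this many characters


@dataclass
class DictionaryFit:
    distinct_values: int  # distinct non-null values in the column
    too_long: int         # distinct values at or over the length limit
    fits: bool            # True if every distinct value could be stored


def check_dictionary_fit(values):
    """Hypothetical helper: compare a column's values with the quoted limits."""
    distinct = {v for v in values if v is not None}
    too_long = sum(1 for v in distinct if len(v) >= MAX_VALUE_LENGTH)
    return DictionaryFit(
        distinct_values=len(distinct),
        too_long=too_long,
        fits=too_long == 0 and len(distinct) <= MAX_DISTINCT_VALUES,
    )


print(check_dictionary_fit(["FL", "GA", "FL", None]))
# DictionaryFit(distinct_values=2, too_long=0, fits=True)
print(check_dictionary_fit([f"user-{i}" for i in range(5000)]).fits)
# False
```

A column that does not fit is not necessarily unusable, but it is a sign that a dictionary would cover only part of the values users might reference.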
Manage example values
If value sampling is enabled for your workspace, example values are automatically added when you select tables as you create a new space.
To turn off example values for a column:
- Click Configure > Data in your Genie space.
- Click a table name to view its columns.
- Click the edit icon next to the column name.
- Click Advanced.
- Turn Example values off.
Turning off example values also disables building a value dictionary for that column. To re-enable example values later, use the same setting.
Manage value dictionaries
Genie generates responses using your prompt, relevant table metadata, sampled values, error signals, and any input code or queries. When a column has an associated value dictionary, Genie leverages the stored values to interpret user prompts better and produce more accurate SQL queries. Value dictionaries significantly improve Genie's accuracy, especially when combined with clear example queries and well-crafted instructions. See Curate an effective Genie space for more guidance.
When selecting columns for value dictionaries, choose string columns that provide helpful context for interpreting prompts. Columns with categorical or consistently formatted values, such as states or product categories, typically work best. Avoid high-cardinality, free-text, or unstructured columns such as user IDs, names, or reviews, as these often lack meaningful context and can reduce accuracy.
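One rough way to separate categorical columns from identifier-like or free-text columns is to compare the number of distinct values with the number of rows. The sketch below is a hypothetical heuristic, not how Genie chooses columns: the function names, the 0.5 threshold, and the sample data are all assumptions made for illustration.

```python
MAX_DICTIONARY_COLUMNS = 60  # limit on value dictionary columns stated in this article


def distinct_ratio(values):
    """Share of non-null values that are distinct (1.0 means every value is unique)."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return 1.0  # treat empty columns as poor candidates
    return len(set(non_null)) / len(non_null)


def rank_candidates(columns, max_ratio=0.5):
    """Hypothetical heuristic: keep low-ratio columns, most repetitive first."""
    scored = [(name, distinct_ratio(vals)) for name, vals in columns.items()]
    keep = [(name, ratio) for name, ratio in scored if ratio <= max_ratio]
    keep.sort(key=lambda item: item[1])
    return [name for name, _ in keep[:MAX_DICTIONARY_COLUMNS]]


# Hypothetical sample of string columns from one table.
sample = {
    "state": ["FL", "FL", "GA", "GA", "FL", "GA"],
    "category": ["SUV", "Sedan", "SUV", "SUV", "Sedan", "Truck"],
    "user_id": ["u1", "u2", "u3", "u4", "u5", "u6"],
}
print(rank_candidates(sample))  # ['state', 'category']
```

On a small sample the ratio is only a hint. A column can repeat values often and still be a poor fit if users never mention those values in their questions, so treat the ranking as a starting point for manual review.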
To set which string columns include a value dictionary:
- Click Configure > Data in your Genie space.
- Click a table name to view its columns.
- Click the edit icon next to the column name.
- Click Advanced.
- Turn Build value dictionary on.
- To disable the value dictionary for a column, turn Build value dictionary off. See Refresh or remove values.
Refresh or remove values
Refreshing sample values updates a column's stored values. Refresh sample values if:
- New values have been added to the column.
- The format of existing values has changed.
To refresh a value dictionary, click the kebab menu in the column view, then click Refresh sample values.