Skip to main content

Use the Genie API to integrate Genie into your applications

Preview

This feature is in Public Preview.

This page explains how to use the Genie API to enable Genie capabilities in your own chatbot, agent, or application.

Overview

The Genie API provides two types of capabilities:

  • Conversation APIs: Enable natural language data querying in applications, chatbots, and AI agent frameworks. These APIs support stateful conversations where users can ask follow-up questions and explore data naturally over time.
  • Management APIs: Enable programmatic creation, configuration, and deployment of Genie spaces across workspaces. Use these APIs for CI/CD pipelines, version control, and automated space management.

This page describes both conversation and management APIs. Before calling the conversation APIs, prepare a well-curated Genie space. The space provides the context that Genie uses to interpret questions and generate answers. If the space is incomplete or untested, users might still receive incorrect results even with a correct API integration. This guide explains the minimum setup needed to create a space that works effectively with the Genie API.

Prerequisites

To use the Genie API, you must have:

  • Access to a Databricks workspace with the Databricks SQL entitlement.
  • At least CAN USE privileges on a SQL pro or serverless SQL warehouse.

Getting started

Configure Databricks authentication

For production use cases where a user with access to a browser is present, use OAuth for users (OAuth U2M). In situations where browser-based authentication is not possible, use a service principal to authenticate with the API. See OAuth for service principals (OAuth M2M). Service principals must have permissions to access the required data and SQL warehouses.

Gather details

  • Workspace instance name: Find and copy your workspace instance name from your Databricks workspace URL. For details about the workspace identifiers in your URL, see Get identifiers for workspace objects.

    Example: https://cust-success.cloud.databricks.com/

  • Warehouse ID: You need the ID of a SQL warehouse that you have at least CAN USE privileges on. To find your warehouse ID:

    1. Go to SQL Warehouses in your workspace.
    2. Select the warehouse you want to use.
    3. Copy the warehouse ID from the URL or the warehouse details page.

    Alternatively, use the List warehouses endpoint GET /api/2.0/sql/warehouses to programmatically retrieve a list of all SQL warehouses that you have permissions to access. The response includes the warehouse ID.

Create or select a Genie space

A well-structured Genie space has the following characteristics:

  • Uses well-annotated data: Genie relies on table metadata and column comments. Verify that your Unity Catalog data sources have clear, descriptive comments.
  • Is user tested: Test your space by asking questions you expect from end users. Use testing to create and refine example SQL queries.
  • Includes company-specific context: Add instructions, example SQL, and functions. See Add SQL examples and instructions. Aim for at least five tested example SQL queries.
  • Uses benchmarks to test accuracy: Add at least five benchmark questions based on anticipated user questions. See Use benchmarks in a Genie space.

For more information on creating a space, see Set up and manage an AI/BI Genie space and Curate an effective Genie space.

You can either create a new Genie space or use an existing one:

Create a Genie space programmatically using the Create Genie space API. The following example demonstrates a well-structured space that follows best practices. Replace the placeholders with your values:

POST /api/2.0/genie/spaces
Host: <DATABRICKS_INSTANCE>
Authorization: Bearer <your_authentication_token>
{
"description": "Space for analyzing sales performance and trends",
"parent_path": "/Workspace/Users/<username>",
"serialized_space": "{\"version\":1,\"config\":{\"sample_questions\":[{\"id\":\"a1b2c3d4e5f6\",\"question\":[\"What were total sales last month?\"]},{\"id\":\"b2c3d4e5f6g7\",\"question\":[\"Show top 10 customers by revenue\"]},{\"id\":\"c3d4e5f6g7h8\",\"question\":[\"Compare sales by region for Q1 vs Q2\"]}]},\"data_sources\":{\"tables\":[{\"identifier\":\"sales.analytics.orders\",\"description\":[\"Transactional order data including order date, amount, and customer information\"],\"column_configs\":[{\"column_name\":\"order_date\",\"get_example_values\":true},{\"column_name\":\"status\",\"get_example_values\":true,\"build_value_dictionary\":true},{\"column_name\":\"region\",\"get_example_values\":true,\"build_value_dictionary\":true}]},{\"identifier\":\"sales.analytics.customers\"},{\"identifier\":\"sales.analytics.products\"}]},\"instructions\":{\"text_instructions\":[{\"id\":\"01f0b37c378e1c91\",\"content\":[\"When calculating revenue, sum the order_amount column. When asked about 'last month', use the previous calendar month (not the last 30 days). Round all monetary values to 2 decimal places.\"]}],\"example_question_sqls\":[{\"id\":\"01f0821116d912db\",\"question\":[\"Show top 10 customers by revenue\"],\"sql\":[\"SELECT customer_name, SUM(order_amount) as total_revenue\\n\",\"FROM sales.analytics.orders o\\n\",\"JOIN sales.analytics.customers c ON o.customer_id = c.customer_id\\n\",\"GROUP BY customer_name\\n\",\"ORDER BY total_revenue DESC\\n\",\"LIMIT 10\"]},{\"id\":\"01f099751a3a1df3\",\"question\":[\"What were total sales last month\"],\"sql\":[\"SELECT SUM(order_amount) as total_sales\\n\",\"FROM sales.analytics.orders\\n\",\"WHERE order_date >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL 1 MONTH)\\n\",\"AND order_date < DATE_TRUNC('month', CURRENT_DATE)\"]}],\"join_specs\":[{\"id\":\"01f0c0b4e8151\",\"left\":{\"identifier\":\"sales.analytics.orders\",\"alias\":\"orders\"},\"right\":{\"identifier\":\"sales.analytics.customers\",\"alias\":\"customers\"},\"sql\":[\"orders.customer_id = customers.customer_id\"]}],\"sql_snippets\":{\"filters\":[{\"id\":\"01f09972e66d1\",\"sql\":[\"orders.order_amount > 1000\"],\"display_name\":\"high value orders\",\"synonyms\":[\"large orders\",\"big purchases\"]}],\"expressions\":[{\"id\":\"01f09974563a1\",\"alias\":\"order_year\",\"sql\":[\"YEAR(orders.order_date)\"],\"display_name\":\"year\"}],\"measures\":[{\"id\":\"01f09972611f1\",\"alias\":\"total_revenue\",\"sql\":[\"SUM(orders.order_amount)\"],\"display_name\":\"total revenue\",\"synonyms\":[\"revenue\",\"total sales\"]}]}}}",
"title": "Sales Analytics Space",
"warehouse_id": "<warehouse-id>"
}

Response:
{
"space_id": "3c409c00b54a44c79f79da06b82460e2",
"title": "Sales Analytics Space",
"description": "Space for analyzing sales performance and trends",
"warehouse_id": "<warehouse-id>",
"serialized_space": "{\n \"version\": 1,\n \"config\": {\n \"sample_questions\": [\n {\n \"id\": \"a1b2c3d4e5f600000000000000000000\",\n \"question\": [\n \"What were total sales last month?\"\n ]\n },\n {\n \"id\": \"b2c3d4e5f6g700000000000000000000\",\n \"question\": [\n \"Show top 10 customers by revenue\"\n ]\n },\n {\n \"id\": \"c3d4e5f6g7h800000000000000000000\",\n \"question\": [\n \"Compare sales by region for Q1 vs Q2\"\n ]\n }\n ]\n },\n \"data_sources\": {\n \"tables\": [\n {\n \"identifier\": \"sales.analytics.orders\",\n \"description\": [\n \"Transactional order data including order date, amount, and customer information\"\n ],\n \"column_configs\": [\n {\n \"column_name\": \"order_date\",\n \"get_example_values\": true\n },\n {\n \"column_name\": \"status\",\n \"get_example_values\": true,\n \"build_value_dictionary\": true\n },\n {\n \"column_name\": \"region\",\n \"get_example_values\": true,\n \"build_value_dictionary\": true\n }\n ]\n },\n {\n \"identifier\": \"sales.analytics.customers\"\n },\n {\n \"identifier\": \"sales.analytics.products\"\n }\n ]\n },\n \"instructions\": {\n \"text_instructions\": [\n {\n \"id\": \"01f0b37c378e1c91\",\n \"content\": [\n \"When calculating revenue, sum the order_amount column. When asked about 'last month', use the previous calendar month (not the last 30 days). Round all monetary values to 2 decimal places.\"\n ]\n }\n ],\n \"example_question_sqls\": [\n {\n \"id\": \"01f0821116d912db\",\n \"question\": [\n \"Show top 10 customers by revenue\"\n ],\n \"sql\": [\n \"SELECT customer_name, SUM(order_amount) as total_revenue\\n\",\n \"FROM sales.analytics.orders o\\n\",\n \"JOIN sales.analytics.customers c ON o.customer_id = c.customer_id\\n\",\n \"GROUP BY customer_name\\n\",\n \"ORDER BY total_revenue DESC\\n\",\n \"LIMIT 10\"\n ]\n },\n {\n \"id\": \"01f099751a3a1df3\",\n \"question\": [\n \"What were total sales last month\"\n ],\n \"sql\": [\n \"SELECT SUM(order_amount) as total_sales\\n\",\n \"FROM sales.analytics.orders\\n\",\n \"WHERE order_date >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL 1 MONTH)\\n\",\n \"AND order_date < DATE_TRUNC('month', CURRENT_DATE)\"\n ]\n }\n ],\n \"join_specs\": [\n {\n \"id\": \"01f0c0b4e8151\",\n \"left\": {\n \"identifier\": \"sales.analytics.orders\",\n \"alias\": \"orders\"\n },\n \"right\": {\n \"identifier\": \"sales.analytics.customers\",\n \"alias\": \"customers\"\n },\n \"sql\": [\n \"orders.customer_id = customers.customer_id\"\n ]\n }\n ],\n \"sql_snippets\": {\n \"filters\": [\n {\n \"id\": \"01f09972e66d1\",\n \"sql\": [\"orders.order_amount > 1000\"],\n \"display_name\": \"high value orders\",\n \"synonyms\": [\"large orders\", \"big purchases\"]\n }\n ],\n \"expressions\": [\n {\n \"id\": \"01f09974563a1\",\n \"alias\": \"order_year\",\n \"sql\": [\"YEAR(orders.order_date)\"],\n \"display_name\": \"year\"\n }\n ],\n \"measures\": [\n {\n \"id\": \"01f09972611f1\",\n \"alias\": \"total_revenue\",\n \"sql\": [\"SUM(orders.order_amount)\"],\n \"display_name\": \"total revenue\",\n \"synonyms\": [\"revenue\", \"total sales\"]\n }\n ]\n }\n }\n}\n"
}

Understanding the serialized_space field

The serialized_space field is a JSON string that defines the configuration and data sources for your Genie space. In the API request, this JSON must be escaped as a string. The field contains:

  • version: Schema version number for backwards compatibility. Use 2 as shown in the example below.
  • config: Space configuration including:
    • sample_questions: Example questions to guide users. Each question requires an id (32-character hex string) and question (array of strings).
  • data_sources: Data sources available to the space:
    • tables: Array of table objects with identifier (three-level namespace), optional description, and optional column_configs.
    • metric_views: Array of metric view objects (same structure as tables).
  • instructions: Structured instructions for the space:
    • text_instructions: High-level guidance for the LLM.
    • example_question_sqls: Example questions with SQL answers, optionally with parameters and usage_guidance.
    • sql_functions: References to SQL functions available to the space.
    • join_specs: Pre-defined join relationships between tables.
    • sql_snippets: Reusable filters, expressions, and measures.
  • benchmarks: Questions for evaluating space quality, each with a ground-truth SQL answer.

The unescaped version of the serialized_space field from the create space example looks like:

JSON
{
"version": 2,
"config": {
"sample_questions": [
{
"id": "a1b2c3d4e5f60000000000000000000a",
"question": ["What were total sales last month?"]
},
{
"id": "b2c3d4e5f6a70000000000000000000b",
"question": ["Show top 10 customers by revenue"]
}
]
},
"data_sources": {
"tables": [
{
"identifier": "sales.analytics.customers",
"description": ["Customer master data including contact information and account details"],
"column_configs": [
{
"column_name": "customer_id",
"description": ["Unique identifier for each customer"],
"synonyms": ["cust_id", "account_id"]
},
{
"column_name": "customer_name",
"enable_entity_matching": true
},
{
"column_name": "internal_notes",
"exclude": true
}
]
},
{
"identifier": "sales.analytics.orders",
"description": ["Transactional order data including order date, amount, and customer information"],
"column_configs": [
{
"column_name": "order_date",
"enable_format_assistance": true
},
{
"column_name": "region",
"enable_format_assistance": true,
"enable_entity_matching": true
},
{
"column_name": "status",
"enable_format_assistance": true,
"enable_entity_matching": true
}
]
},
{
"identifier": "sales.analytics.products"
}
],
"metric_views": [
{
"identifier": "sales.analytics.revenue_metrics",
"description": ["Pre-aggregated revenue metrics by region and time period"],
"column_configs": [
{
"column_name": "period",
"description": ["Time period for the metric (monthly, quarterly, yearly)"],
"enable_format_assistance": true
}
]
}
]
},
"instructions": {
"text_instructions": [
{
"id": "01f0b37c378e1c9100000000000000a1",
"content": [
"When calculating revenue, sum the order_amount column. ",
"When asked about 'last month', use the previous calendar month. ",
"Round all monetary values to 2 decimal places."
]
}
],
"example_question_sqls": [
{
"id": "01f0821116d912db00000000000000b1",
"question": ["Show top 10 customers by revenue"],
"sql": [
"SELECT customer_name, SUM(order_amount) as total_revenue\n",
"FROM sales.analytics.orders o\n",
"JOIN sales.analytics.customers c ON o.customer_id = c.customer_id\n",
"GROUP BY customer_name\n",
"ORDER BY total_revenue DESC\n",
"LIMIT 10"
]
},
{
"id": "01f099751a3a1df300000000000000b2",
"question": ["What were total sales last month"],
"sql": [
"SELECT SUM(order_amount) as total_sales\n",
"FROM sales.analytics.orders\n",
"WHERE order_date >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL 1 MONTH)\n",
"AND order_date < DATE_TRUNC('month', CURRENT_DATE)"
]
},
{
"id": "01f099751a3a1df300000000000000b3",
"question": ["Show sales for a specific region"],
"sql": [
"SELECT SUM(order_amount) as total_sales\n",
"FROM sales.analytics.orders\n",
"WHERE region = :region_name"
],
"parameters": [
{
"name": "region_name",
"type_hint": "STRING",
"description": ["The region to filter by (e.g., 'North America', 'Europe')"],
"default_value": {
"values": ["North America"]
}
}
],
"usage_guidance": ["Use this example when the user asks about sales filtered by a specific geographic region"]
}
],
"sql_functions": [
{
"id": "01f0c0b4e815100000000000000000f1",
"identifier": "sales.analytics.fiscal_quarter"
}
],
"join_specs": [
{
"id": "01f0c0b4e815100000000000000000c1",
"left": {
"identifier": "sales.analytics.orders",
"alias": "orders"
},
"right": {
"identifier": "sales.analytics.customers",
"alias": "customers"
},
"sql": ["orders.customer_id = customers.customer_id"],
"comment": ["Join orders to customers on customer_id"],
"instruction": ["Use this join when you need customer details for order analysis"]
}
],
"sql_snippets": {
"filters": [
{
"id": "01f09972e66d100000000000000000d1",
"sql": ["orders.order_amount > 1000"],
"display_name": "high value orders",
"synonyms": ["large orders", "big purchases"],
"comment": ["Filters to orders over $1000"],
"instruction": ["Use when the user asks about high-value or large orders"]
}
],
"expressions": [
{
"id": "01f09974563a100000000000000000e1",
"alias": "order_year",
"sql": ["YEAR(orders.order_date)"],
"display_name": "year",
"synonyms": ["fiscal year", "calendar year"],
"comment": ["Extracts the year from order date"],
"instruction": ["Use for year-over-year analysis"]
}
],
"measures": [
{
"id": "01f09972611f100000000000000000f1",
"alias": "total_revenue",
"sql": ["SUM(orders.order_amount)"],
"display_name": "total revenue",
"synonyms": ["revenue", "total sales"],
"comment": ["Sum of all order amounts"],
"instruction": ["Use this measure for revenue calculations"]
}
]
}
},
"benchmarks": {
"questions": [
{
"id": "01f0d0b4e815100000000000000000g1",
"question": ["What is the average order value?"],
"answer": [
{
"format": "SQL",
"content": ["SELECT AVG(order_amount) as avg_order_value\n", "FROM sales.analytics.orders"]
}
]
}
]
}
}

When constructing your space, create this JSON structure and then escape it as a string for the API request. For complete schema details, see the Create Genie space API reference.

Validation rules for serialized_space

The serialized_space JSON must conform to the following validation rules. JSON that is not valid is rejected during space creation or update.

Version

  • Version field: Required. Use 2 for new spaces. The version number exists for backwards compatibility.

ID format

All ID fields must be 32-character lowercase hexadecimal strings (UUID format without hyphens).

  • Valid: a1b2c3d4e5f60000000000000000000a
  • Not valid: a1b2c3d4e5f6 (too short), A1B2C3D4E5F60000000000000000000A (uppercase), a1b2c3d4-e5f6-0000-0000-00000000000a (contains hyphens)

IDs are required for:

  • config.sample_questions[].id
  • instructions.text_instructions[].id
  • instructions.example_question_sqls[].id
  • instructions.join_specs[].id
  • instructions.sql_snippets.filters[].id
  • instructions.sql_snippets.expressions[].id
  • instructions.sql_snippets.measures[].id
  • benchmarks.questions[].id (if benchmarks are included)

Sorting requirements

Collections containing IDs or identifiers must be pre-sorted. The system validates that arrays are already sorted and rejects unsorted input.

Collection

Sort key

data_sources.tables

identifier (alphabetically)

data_sources.metric_views

identifier (alphabetically)

data_sources.tables[].column_configs

column_name (alphabetically)

data_sources.metric_views[].column_configs

column_name (alphabetically)

config.sample_questions

id (alphabetically)

instructions.text_instructions

id (alphabetically)

instructions.example_question_sqls

id (alphabetically)

instructions.sql_functions

(id, identifier) tuple (alphabetically)

instructions.join_specs

id (alphabetically)

instructions.sql_snippets.filters

id (alphabetically)

instructions.sql_snippets.expressions

id (alphabetically)

instructions.sql_snippets.measures

id (alphabetically)

benchmarks.questions

id (alphabetically)

Uniqueness constraints

  • Question IDs: All IDs in config.sample_questions and benchmarks.questions must be unique across both collections.
  • Instruction IDs: All IDs across text_instructions, example_question_sqls, sql_functions, join_specs, and all sql_snippets types must be unique.
  • Column configs: The combination of (table_identifier, column_name) must be unique within the space.

Size and length limits

  • String length: Individual string elements are limited to 25,000 characters.
  • Array size: Repeated fields are limited to 10,000 items.
  • Text instructions: At most 1 text instruction is allowed per space.
  • Tables and metric views: Subject to workspace-specific limits.
  • SQL content: Query text in sql and join_specs.sql fields is subject to length limits.

Other requirements

  • Table identifiers: Must use three-level namespace format (catalog.schema.table).
  • Benchmark answers: Each benchmark question must have exactly one answer with format set to SQL.
  • SQL snippets: Filter, expression, and measure SQL fields must not be empty.

Using the conversation API

After you configure a Genie space, use the conversation API endpoints to ask questions, retrieve results, and maintain multi-turn conversations with context.

Start a conversation

The Start conversation endpoint POST /api/2.0/genie/spaces/{space_id}/start-conversation starts a new conversation in your Genie space.

Replace the placeholders with your Databricks instance, Genie space ID, and authentication token. An example of a successful response follows the request. It includes details that you can use to access this conversation again for follow-up questions.

POST /api/2.0/genie/spaces/{space_id}/start-conversation

HOST= <DATABRICKS_INSTANCE>
Authorization: <your_authentication_token>
{
"content": "<your question>",
}


Response:

{
"conversation": {
"created_timestamp": 1719769718,
"id": "6a64adad2e664ee58de08488f986af3e",
"last_updated_timestamp": 1719769718,
"space_id": "3c409c00b54a44c79f79da06b82460e2",
"title": "Give me top sales for last month",
"user_id": 12345
},
"message": {
"attachments": null,
"content": "Give me top sales for last month",
"conversation_id": "6a64adad2e664ee58de08488f986af3e",
"created_timestamp": 1719769718,
"error": null,
"id": "e1ef34712a29169db030324fd0e1df5f",
"last_updated_timestamp": 1719769718,
"query_result": null,
"space_id": "3c409c00b54a44c79f79da06b82460e2",
"status": "IN_PROGRESS",
"user_id": 12345
}
}

Retrieve generated SQL

Use the conversation_id and message_id in the response to poll to check the message's generation status and retrieve the generated SQL from Genie. See GET /api/2.0/genie/spaces/{space_id}/conversations/{conversation_id}/messages/{message_id} for complete request and response details.

note

Only POST requests count toward the queries-per-minute throughput limit. GET requests used to poll results are not subject to this limit.

Substitute your values into the following request:

GET /api/2.0/genie/spaces/{space_id}/conversations/{conversation_id}/messages/{message_id}
HOST= <DATABRICKS_INSTANCE>
Authorization: Bearer <your_authentication_token>

The following example response reports the message details:

Response:

{
"attachments": null,
"content": "Give me top sales for last month",
"conversation_id": "6a64adad2e664ee58de08488f986af3e",
"created_timestamp": 1719769718,
"error": null,
"id": "e1ef34712a29169db030324fd0e1df5f",
"last_updated_timestamp": 1719769718,
"query_result": null,
"space_id": "3c409c00b54a44c79f79da06b82460e2",
"status": "IN_PROGRESS",
"user_id": 12345
}

When the status field is COMPLETED the response is populated in the attachments array.

Retrieve query results

The attachments array contains Genie's response. It includes the generated text response (text), the query statement if it exists (query), and an identifier that you can use to get the associated query results (attachment_id). Replace the placeholders in the following example to retrieve the generated query results:

GET /api/2.0/genie/spaces/{space_id}/conversations/{conversation_id}/messages/{message_id}/query-result/{attachment_id}
Authorization: Bearer <your_authentication_token>

See GET /api/2.0/genie/spaces/{space_id}/conversations/{conversation_id}/messages/{message_id}/attachments/{attachment_id}/query-result.

Ask follow-up questions

After you receive a response, use the conversation_id to continue the conversation. Context from previous messages is retained and used in follow-up responses. For complete request and response details, see POST /api/2.0/genie/spaces/{space_id}/conversations/{conversation_id}/messages.

POST /api/2.0/genie/spaces/{space_id}/conversations/{conversation_id}/messages
HOST= <DATABRICKS_INSTANCE>
Authorization: <your_authentication_token>
{
"content": "Which of these customers opened and forwarded the email?",
}

Retrieve space and conversation data

The Genie API provides additional endpoints for retrieving configuration and historical data from existing spaces and conversations.

Retrieve space configuration

When retrieving space information using the Get Genie Space API, you can include the serialized_space field in the response by setting the include_serialized_space parameter to true. The serialized_space field contains the serialized string representation of the Genie space, including instructions, benchmarks, joins, and other configuration details.

Use this serialized representation with the Create Genie Space API and Update Genie Space API to promote Genie spaces across workspaces or create backups of space configurations.

Example GET request:

GET /api/2.0/genie/spaces/{space_id}?include_serialized_space=true
Host: <DATABRICKS_INSTANCE>
Authorization: Bearer <your_authentication_token>

Response:
{
"space_id": "3c409c00b54a44c79f79da06b82460e2",
"title": "Sales Analytics Space",
"description": "Space for analyzing sales performance and trends",
"warehouse_id": "<warehouse-id>",
"created_timestamp": 1719769718,
"last_updated_timestamp": 1719769718,
"serialized_space": "{\"version\":1,\"config\":{\"sample_questions\":[{\"id\":\"a1b2c3d4e5f600000000000000000000\",\"question\":[\"What were total sales last month?\"]},{\"id\":\"b2c3d4e5f6g700000000000000000000\",\"question\":[\"Show top 10 customers by revenue\"]}]},\"data_sources\":{\"tables\":[{\"identifier\":\"sales.analytics.orders\",\"description\":[\"Transactional order data including order date, amount, and customer information\"],\"column_configs\":[{\"column_name\":\"order_date\",\"get_example_values\":true},{\"column_name\":\"status\",\"get_example_values\":true,\"build_value_dictionary\":true},{\"column_name\":\"region\",\"get_example_values\":true,\"build_value_dictionary\":true}]},{\"identifier\":\"sales.analytics.customers\"},{\"identifier\":\"sales.analytics.products\"}]},\"instructions\":{\"text_instructions\":[{\"id\":\"01f0b37c378e1c91\",\"content\":[\"When calculating revenue, sum the order_amount column. When asked about 'last month', use the previous calendar month (not the last 30 days). Round all monetary values to 2 decimal places.\"]}],\"example_question_sqls\":[{\"id\":\"01f0821116d912db\",\"question\":[\"Show top 10 customers by revenue\"],\"sql\":[\"SELECT customer_name, SUM(order_amount) as total_revenue\\n\",\"FROM sales.analytics.orders o\\n\",\"JOIN sales.analytics.customers c ON o.customer_id = c.customer_id\\n\",\"GROUP BY customer_name\\n\",\"ORDER BY total_revenue DESC\\n\",\"LIMIT 10\"]},{\"id\":\"01f099751a3a1df3\",\"question\":[\"What were total sales last month\"],\"sql\":[\"SELECT SUM(order_amount) as total_sales\\n\",\"FROM sales.analytics.orders\\n\",\"WHERE order_date >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL 1 MONTH)\\n\",\"AND order_date < DATE_TRUNC('month', CURRENT_DATE)\"]}],\"join_specs\":[{\"id\":\"01f0c0b4e8151\",\"left\":{\"identifier\":\"sales.analytics.orders\",\"alias\":\"orders\"},\"right\":{\"identifier\":\"sales.analytics.customers\",\"alias\":\"customers\"},\"sql\":[\"orders.customer_id = customers.customer_id\"]}],\"sql_snippets\":{\"filters\":[{\"id\":\"01f09972e66d1\",\"sql\":[\"orders.order_amount > 1000\"],\"display_name\":\"high value orders\",\"synonyms\":[\"large orders\",\"big purchases\"]}],\"expressions\":[{\"id\":\"01f09974563a1\",\"alias\":\"order_year\",\"sql\":[\"YEAR(orders.order_date)\"],\"display_name\":\"year\"}],\"measures\":[{\"id\":\"01f09972611f1\",\"alias\":\"total_revenue\",\"sql\":[\"SUM(orders.order_amount)\"],\"display_name\":\"total revenue\",\"synonyms\":[\"revenue\",\"total sales\"]}]}}}"
}

Reference old conversation threads

To allow users to refer to old conversation threads, use the List conversation messages endpoint GET /api/2.0/genie/spaces/{space_id}/conversations/{conversation_id}/messages to retrieve all messages from a specific conversation thread.

Retrieve conversation data for analysis

Space managers can programmatically retrieve all previous messages asked across all users of a space for analysis. To retrieve this data:

  1. Use the GET /api/2.0/genie/spaces/{space_id}/conversations endpoint to get all existing conversation threads in a space.
  2. For each conversation ID returned, use the GET /api/2.0/genie/spaces/{space_id}/conversations endpoint to retrieve the list of messages for that conversation.

Best practices and limits

Best practices for using the Genie API

To maintain performance and reliability when using the Genie API:

  • Implement retry logic with exponential backoff: The API doesn't retry failed requests for you, so add your own queuing and exponential backoff. This helps your application handle transient failures, avoid unnecessary repeat requests, and stay within throughput limits as it grows.
  • Log API responses: Implement comprehensive logging of API requests and responses to help with debugging, monitoring usage patterns, and tracking costs.
  • Poll for status updates every 1 to 5 seconds: Continue polling until a conclusive message status, such as COMPLETED, FAILED, or CANCELLED, is received. Limit polling to 10 minutes for most queries. If there is no conclusive response after 10 minutes, stop polling and return a timeout error or prompt the user to manually check the query status later.
  • Use exponential backoff for polling: Increase the delay between polls up to a maximum of one minute. This reduces unnecessary requests for long-running queries while still allowing low latency for fast ones.
  • Start a new conversation for each session: Avoid reusing conversation threads across sessions, as this can reduce accuracy due to unintended context reuse.
  • Maintain conversation limits: To manage old conversations and stay under the 10,000 conversation limit:
    1. Use the GET /api/2.0/genie/spaces/{space_id}/conversations endpoint to see all existing conversation threads in a space.
    2. Identify conversations that are no longer needed, such as older conversations or test conversations.
    3. Use the DELETE /api/2.0/genie/spaces/{space_id}/conversations/{conversation_id} endpoint to remove conversations programmatically.
  • Be aware of query result limit: The Genie API returns a maximum of 5,000 rows per query result. If your data analysis requires more rows, consider refining your question to focus on a specific subset of data or use filters to narrow the results.

Throughput limit

During the Public Preview period, the throughput rates for the Genie API free tier are best-effort and depend on system capacity. Under normal or low-traffic conditions, the API limits requests to 5 queries per minute per workspace. During peak usage periods, the system processes requests based on available capacity, which can result in lower throughput.

Monitor the space

After your application is set up, you can monitor questions and responses in the Databricks UI.

Encourage users to test the space so that you learn about the types of questions they are likely to ask and the responses they receive. Provide users with guidance to help them start testing the space. Use the Monitoring tab to view questions and responses. See Monitor the space.

You can also use audit logs to monitor activity in a Genie space. See AI/BI Genie events.