Infrastructure setup

Preview

This feature is in Private Preview. To try it, reach out to your Databricks contact.

Looking for a different RAG Studio doc? Go to the RAG documentation index

This document walks you through configuring the required infrastructure to create a RAG Studio application:

  1. Databricks Workspace

  2. <uc> schema

  3. Vector Search endpoint

  4. Personal Access Token saved in Secrets manager

  5. Generative AI models

  6. Cluster configurations

You will need these values when creating your RAG Application, so we suggest using a scratch pad, such as the one below, to write down these values as you walk through the steps below. These values will be requested from you when you initialize your application.

vector_search_endpoint_name:
unity_catalog_catalog_name:
unity_catalog_schema_name:
secret_scope:
secret_name:
model_serving_endpoint_chat: databricks-llama-2-70b-chat
model_serving_endpoint_embeddings: databricks-bge-large-en

Databricks workspace

Select a Databricks workspace with Unity Catalog and serverless enabled in a supported region. Note the URL of the workspace to use when configuring the application e.g., https://workspace-name.cloud.databricks.com.

Unity Catalog schema

RAG Studio creates all assets within a Unity Catalog schema.

  1. Create a new catalog and/or new schema or select an existing catalog / schema.

  2. Assign Data Editor permissions for your Databricks account to the catalog / schema using SQL or the Catalog Explorer

    Note

    If you created a new catalog/schema, you already have the necessary permissions.

    GRANT
        USE SCHEMA,
        APPLY TAG,
        MODIFY,
        READ VOLUME,
        REFRESH,
        SELECT,
        WRITE VOLUME,
        CREATE FUNCTION,
        CREATE MATERIALIZED VIEW,
        CREATE MODEL,
        CREATE TABLE,
        CREATE VOLUME
    ON SCHEMA my_schema
    TO `user@domain.com`;
    
    data_editor

Vector Search Endpoint

Create a new endpoint using the UI or Python SDK or select an existing endpoint.

Personal Access Token saved in Secrets manager

Warning

This approach is a temporary workaround to enable your app’s chain, which is hosted on Model Serving to access to the vector search indexes created by RAG Studio. In the future, this will not be needed.

  1. Create a personal access token (PAT) that has access to the Unity Catalog schema you created above.

    • Option 1: Create a PAT token for your user account by following these steps. .. note :: Using a PAT token is only suggested for development. Using a service principal is strongly recommended for production.

    • If you need to use a service principal, reach out to the RAG Studio team at rag-feedback@databricks.com.

  1. Save the PAT to a secret scope

    Note

    These steps assume you have followed the Development environment to install the Databricks CLI. For detailed instructions, refer to the Secret management documentation.

    databricks secrets create-scope <scope-name>
    databricks secrets put-secret <scope-name> <secret-name>
    

Generative AI models

RAG Studio natively integrates with Databricks Model Serving for access to Foundational Models. This integration is used for RAG Studio’s 🤖 LLM Judge and within your 🔗 Chain and 🗃️ Data Processor.

You need access to 2 types of models:

  1. Chat model following llm/v1/chat schema

  2. Embeddings model following llm/v1/embeddings

Note

No additional set up is required to use LLaMa2-70B-Chat and BGE-Large-EN use Open Source models hosted by Databricks Foundation Model APIs with pay-per-token.

Optionally, you can also configure: