Cluster Configuration for RAG Studio

This article describes the clusters that RAG Studio provisions to automate tasks such as data ingestion, RAG chain creation, and RAG evaluation.

By default, RAG Studio provisions new job clusters specifically for these tasks.

Default cluster provisioning

The default clusters provisioned by RAG Studio have the following properties:

  • Access Mode: Assigned

  • Databricks Runtime Version: 13.3 LTS ML

These defaults favor stability and performance: 13.3 LTS ML is a long-term support runtime with common machine learning libraries preinstalled.
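To make the two default properties concrete, here is a minimal sketch of how they might appear as fields in a Databricks cluster specification, plus a helper that reports where a given spec differs. The field values are assumptions, not taken from RAG Studio: "Assigned" access mode is taken to correspond to `data_security_mode: SINGLE_USER`, and 13.3 LTS ML to the CPU runtime key `13.3.x-cpu-ml-scala2.12`. The names `DEFAULT_CLUSTER_PROPERTIES` and `mismatched_properties` are hypothetical.

```python
# Assumed mapping of the documented defaults onto cluster-spec fields.
# These values are illustrative; the spec RAG Studio actually submits
# is not shown in this article.
DEFAULT_CLUSTER_PROPERTIES = {
    "spark_version": "13.3.x-cpu-ml-scala2.12",  # assumed key for 13.3 LTS ML (CPU)
    "data_security_mode": "SINGLE_USER",         # assumed equivalent of "Assigned"
}


def mismatched_properties(cluster_spec: dict) -> dict:
    """Return {field: (expected, actual)} for each field that differs from the defaults."""
    return {
        field: (expected, cluster_spec.get(field))
        for field, expected in DEFAULT_CLUSTER_PROPERTIES.items()
        if cluster_spec.get(field) != expected
    }


if __name__ == "__main__":
    candidate = {
        "spark_version": "13.3.x-cpu-ml-scala2.12",
        "data_security_mode": "USER_ISOLATION",
    }
    # Reports only the access-mode field, since the runtime matches.
    print(mismatched_properties(candidate))
```

A check like this could be useful when reviewing whether a cluster policy permits the properties listed above; a missing field is reported with an actual value of `None`.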

Permissions requirement

To allow RAG Studio to provision these clusters automatically, ensure that your Databricks account has the necessary permissions to create job clusters with the above properties.

Use an existing interactive cluster

If you prefer to run RAG Studio tasks on an existing interactive cluster, pass the cluster ID when you invoke the rag command, for example:

./rag create-rag-version -e dev --cluster-id <your-cluster-id>

To identify a cluster’s ID, see Cluster URL and ID.

Alternatively, you can specify a cluster ID in your rag-config.yml configuration file. This method is useful for setting a default cluster for all RAG Studio operations within a specific environment. Add the cluster_id field under the appropriate environment section, as shown below:

development:
  - name: dev
    ...
    cluster_id: <your_cluster_id>

Overriding the default cluster is supported only for the dev environment.
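If you invoke the CLI from a script, the command-line override and the dev-only restriction can be combined in a small wrapper. The sketch below only builds the argument list for the `create-rag-version` command shown earlier; the function name `build_rag_command` is hypothetical, and it assumes that the restriction applies to the environment named `dev`.

```python
from typing import List, Optional


def build_rag_command(environment: str, cluster_id: Optional[str] = None) -> List[str]:
    """Build the argument list for ./rag create-rag-version.

    Assumption: a cluster override is accepted only when the environment
    name is exactly "dev", per the restriction stated in this article.
    """
    command = ["./rag", "create-rag-version", "-e", environment]
    if cluster_id is not None:
        if environment != "dev":
            raise ValueError(
                f"cluster override is only supported for 'dev', got '{environment}'"
            )
        command += ["--cluster-id", cluster_id]
    return command


if __name__ == "__main__":
    # Placeholder ID; replace with the ID of your interactive cluster.
    print(build_rag_command("dev", "<your-cluster-id>"))
```

The returned list can be passed to `subprocess.run(..., check=True)` from the project directory. Building a list instead of a shell string avoids quoting problems if an argument contains unusual characters.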