Deploy a RAG application to production


This feature is in Private Preview. To try it, reach out to your Databricks contact.


RAG Studio includes multiple environments to help you manage the lifecycle of your application. Until now, these tutorials have worked in the RAG Studio development and Reviewers environments.

In this tutorial, you will deploy a version of your application to the End Users environment. Read Environments for more details about how environments work and why they are used.

  1. If you did not already run this command in Initialize a RAG Application, run it now to initialize these environments. This command takes about 10 minutes to run.

    ./rag setup-prod-env


    See Infrastructure and Unity Catalog assets created by RAG Studio for details of what is created in your Workspace and Unity Catalog schema.

  2. Run the following command to deploy the version to the End Users Environment. This command takes about 10 minutes to run.

    ./rag deploy-chain -v 1 -e end_users
  3. In the console, you will see output similar to the following. Open the URL in your web browser to open the 💬 Review UI. You can share this URL with your 🧠 Expert Users.

    ...truncated for clarity of docs...
    Task deploy_chain_task:
    Your Review UI is now available. Open the Review UI here: https://<workspace-url>/ml/review/model/catalog.schema.rag_studio_databricks-docs-bot/version/1/environment/end_users
  4. If you want 👤 End Users to use the 💬 Review UI, add permissions to the deployed version.

    • Grant each Databricks user who needs access read permissions on:

      • the MLflow Experiment

      • the Model Serving endpoint

      • the Unity Catalog Model


    🚧 Roadmap 🚧 Support for using any corporate SSO to access the 💬 Review UI, i.e., no Databricks account required.

  5. Now, each time an 👤 End User chats with your RAG Application, the 🗂️ Request Log and 👍 Assessment & Evaluation Results Log are populated.
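
The deploy step above prints the 💬 Review UI URL in the middle of a long console log. If you want to capture that URL in a script (for example, to share it with users), a small filter along the following lines may help. This is only a sketch: `extract_review_url` is a hypothetical helper, not part of the `./rag` CLI, and it assumes the URL contains `/ml/review/` as in the sample output in step 3. The exact log format may differ in your workspace.

```shell
#!/usr/bin/env bash
# Sketch: pull the Review UI URL out of `./rag deploy-chain` console output.
# Assumption: the URL starts with https:// and contains "/ml/review/",
# as in the sample output shown in step 3.

# Hypothetical helper: reads deploy output on stdin, prints the first match.
extract_review_url() {
  grep -Eo 'https://[^[:space:]]+/ml/review/[^[:space:]]+' | head -n 1
}

# Demo on the sample output from step 3 (placeholder workspace URL kept as-is).
extract_review_url <<'EOF'
...truncated for clarity of docs...
Task deploy_chain_task:
Your Review UI is now available. Open the Review UI here: https://<workspace-url>/ml/review/model/catalog.schema.rag_studio_databricks-docs-bot/version/1/environment/end_users
EOF

# Real usage (commented out: takes about 10 minutes and needs a RAG Studio project):
# ./rag deploy-chain -v 1 -e end_users 2>&1 | tee deploy.log | extract_review_url
```

Using `tee` in the real invocation keeps the full log in `deploy.log` for troubleshooting while only the URL is printed.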

Data flow