🤖 LLM Judge

Preview

This feature is in Private Preview. To try it, reach out to your Databricks contact.


Conceptual overview

🤖 LLM Judge uses an LLM to provide automated feedback on your RAG Application's responses, giving you additional insight into your application's quality.

Configuring 🤖 LLM Judge

  1. Open the rag-config.yml in your IDE/code editor.

  2. Edit the global_config.evaluation.assessment_judges configuration (a sketch of where this block sits inside rag-config.yml follows these steps).

    evaluation:
      # Configure the LLM judges for assessments
      assessment_judges:
        - judge_name: LLaMa2-70B-Chat
          endpoint_name: databricks-llama-2-70b-chat # Model Serving endpoint name
          assessments: # predefined list, based on metric names
            - harmful
            - answer_correct
            - faithful_to_context
            - relevant_to_question_and_context
    

    Tip

    🚧 Roadmap 🚧 Support for customer-defined 🤖 LLM Judge assessments.

  3. RAG Studio automatically computes 🤖 LLM Judge assessments for every invocation of your 🔗 Chain.

    Tip

    🚧 Roadmap 🚧 Configuration to control when the 🤖 LLM Judge is or isn’t run, including sampling only x% of responses.
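
For reference, the assessment_judges block from step 2 lives under the top-level global_config key of rag-config.yml. The minimal sketch below shows only that nesting; the judge values are the example values from step 2, and any other keys in your file stay as they are.

    global_config:
      evaluation:
        assessment_judges:
          - judge_name: LLaMa2-70B-Chat
            endpoint_name: databricks-llama-2-70b-chat
            assessments:
              - harmful
              - answer_correct
              - faithful_to_context
              - relevant_to_question_and_context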

Data flows

Online evaluation

[Diagram: online evaluation data flow]

Offline evaluation

[Diagram: offline evaluation data flow]
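
The diagrams above show where 🤖 LLM Judge assessments are produced in the online and offline evaluation flows. As a rough illustration only, and not RAG Studio's internal implementation, the sketch below shows the kind of call a single judge assessment reduces to: prompting the Model Serving endpoint configured above with the question, the 🔗 Chain's answer, and the retrieved context, and asking for a faithfulness verdict. The prompt wording, the judge_faithfulness helper, and the MLflow Deployments client usage are assumptions for illustration.

    # Illustrative sketch only -- not RAG Studio's implementation.
    from mlflow.deployments import get_deploy_client

    JUDGE_ENDPOINT = "databricks-llama-2-70b-chat"  # endpoint_name from rag-config.yml

    def judge_faithfulness(question: str, answer: str, context: str) -> str:
        """Ask the judge LLM whether the answer is faithful to the retrieved context."""
        prompt = (
            "You are evaluating a RAG application's response.\n"
            f"Question: {question}\n"
            f"Retrieved context: {context}\n"
            f"Answer: {answer}\n"
            "Is the answer faithful to the retrieved context? "
            "Reply YES or NO, followed by a one-sentence reason."
        )
        client = get_deploy_client("databricks")
        response = client.predict(
            endpoint=JUDGE_ENDPOINT,
            inputs={
                "messages": [{"role": "user", "content": prompt}],
                "temperature": 0.0,  # deterministic judging
                "max_tokens": 128,
            },
        )
        # Response format assumed to follow the chat-completions schema.
        return response["choices"][0]["message"]["content"]

In RAG Studio itself you do not call the judge directly: as described in step 3, assessments are computed automatically for every 🔗 Chain invocation.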