What is Lakehouse Monitoring for generative AI?

Beta

This feature is in Beta.

This page describes how to monitor generative AI apps using Lakehouse Monitoring for GenAI. Lakehouse Monitoring is tightly integrated with Agent Evaluation, so you can use the same evaluation configuration (LLM judges and custom metrics) for both offline evaluation and online monitoring.

You can monitor gen AI apps deployed using Mosaic AI Agent Framework or those deployed outside of Databricks.

Lakehouse Monitoring for gen AI helps you track operational metrics like volume, latency, errors, and cost, as well as quality metrics like correctness and guideline adherence, using Mosaic AI Agent Evaluation AI judges.

[Image: Lakehouse Monitoring for gen AI UI]

Product overview

Lakehouse Monitoring for GenAI uses MLflow Tracing, an open standard for GenAI observability based on OpenTelemetry, to instrument and capture production logs from your GenAI app. To use monitoring, first instrument your GenAI app with MLflow Tracing, as in the sketch below.
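A minimal instrumentation sketch (assuming a recent MLflow version with the tracing API; `answer_question` is a hypothetical app entry point, not part of the product):

```python
import mlflow

# Decorate your own code to emit one trace per call. Supported libraries
# (for example, OpenAI or LangChain) can instead be auto-instrumented,
# such as with mlflow.openai.autolog().
@mlflow.trace
def answer_question(question: str) -> str:
    # Placeholder: replace with your retrieval and LLM calls.
    return f"You asked: {question}"

# Each invocation now produces an MLflow trace that monitoring can capture.
answer_question("What is Lakehouse Monitoring?")
```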

Monitoring is designed to:

  1. Help you identify quality and performance (cost, latency) issues in your production agent
    • Automatically run LLM judges to assess the quality of your production agent
    • View a dashboard with metrics about the quality of your production agent
    • Review individual traces (e.g., user requests)
  2. Transfer underperforming traces to your development loop to iteratively test fixes for the identified issues
    • Add individual traces to an Evaluation Dataset to use with Agent Evaluation (see the sketch after this list)
    • Send individual traces to the Review App to collect ground-truth labels from subject matter experts
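For example, once underperforming traces are exported to an evaluation dataset, you can re-run the same LLM judges offline. A minimal sketch (assuming the Agent Evaluation integration with `mlflow.evaluate`; the sample data is illustrative):

```python
import mlflow
import pandas as pd

# Illustrative traces; in practice, export underperforming requests and
# responses from the monitoring tables or UI.
eval_df = pd.DataFrame(
    {
        "request": ["What is Lakehouse Monitoring?"],
        "response": ["It tracks operational and quality metrics for GenAI apps."],
    }
)

# Run Agent Evaluation's LLM judges offline on the exported traces.
results = mlflow.evaluate(
    data=eval_df,
    model_type="databricks-agent",
)
print(results.tables["eval_results"])
```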

The following diagram illustrates the workflow that monitoring enables.

[Image: Monitoring workflow diagram]

note

This workflow is also applicable to pre-production apps that are used by beta testers.

Requirements

To monitor apps deployed using Mosaic AI Agent Framework:

  • Serverless jobs must be enabled.
  • To use LLM judge metrics, Partner-powered AI assistive features must be enabled. Other metrics, such as latency, are supported regardless of this setting.

Limitations

important
  • Online monitoring is currently in Beta and is available only in workspaces that are enabled for Beta products.
  • The following features are currently not available in the public Beta release:
    • User feedback logging
    • Custom metrics

If you need these features, or if your workspace is not yet enabled for the monitoring Beta, contact your Databricks account representative for access.

Set up monitoring

Agent monitoring supports agents deployed using Mosaic AI Agent Framework and gen AI apps deployed outside Databricks. The steps that you follow depend on the type of app you need to monitor. For details, see the following: