Debug & Observe Your App with Tracing
Once your GenAI application is instrumented with MLflow Tracing, you gain powerful tools to debug its behavior, understand its performance, and observe its inputs and outputs. This guide focuses on how to effectively use the MLflow UI within Databricks and leverage its notebook integration for these purposes.
Reviewing Traces in the Databricks MLflow UI
All captured traces are logged to an MLflow Experiment. You can access them through the MLflow UI in your Databricks workspace.
- Navigate to Your Experiment: Go to the experiment where your traces are logged (e.g., the one set by `mlflow.set_experiment("/Shared/my-genai-app-traces")`).
- Open the "Traces" Tab: Within the experiment view, click the "Traces" tab. This displays a list of all traces logged to that experiment. (A minimal setup sketch follows this list.)
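If you have not yet pointed your application at an experiment, a minimal sketch of setting one and producing a first trace might look like this (the experiment path and function are illustrative, not part of any existing setup):

```python
import mlflow

# Log traces to a workspace experiment (illustrative path; use your own).
mlflow.set_experiment("/Shared/my-genai-app-traces")

@mlflow.trace  # captures inputs, outputs, timing, and errors for this call
def answer_question(question: str) -> str:
    # Placeholder for your real LLM or retrieval logic.
    return f"You asked: {question}"

answer_question("What is MLflow Tracing?")
```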
Understanding the Trace List View
The trace list provides a high-level overview of your traces, with sortable columns that typically include:
- Trace ID: The unique identifier for each trace.
- Request: A preview of the initial input that triggered the trace.
- Response: A preview of the final output of the trace.
- Session: The session identifier, if provided, grouping related traces (e.g., in a conversation).
- User: The user identifier, if provided.
- Execution time: Total time taken for the trace to complete.
- Request time: The timestamp when the trace was initiated.
- Run name: If the trace is associated with an MLflow Run, its name will be displayed here, linking them.
- Source: The origin of the trace, often indicating the instrumented library or component (e.g., `openai`, `langchain`, or a custom trace name).
- State: The current status of the trace (e.g., `OK`, `ERROR`, `IN_PROGRESS`).
- Trace name: The specific name assigned to this trace, often the root span's name.
- Assessments: Individual columns for each assessment type (e.g., `my_scorer`, `professional`). The UI also often displays a summary section above the list showing aggregated assessment metrics (such as averages or pass/fail rates) across the currently visible traces.
- Tags: Individual tags can be displayed as columns (e.g., `persona`, `style`). A summary count of tags may also be shown. The sketch below shows one way to populate the session, user, and tag fields from code.
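The Session, User, and Tags columns are populated from metadata attached to each trace at logging time. A hedged sketch, assuming your MLflow version exposes `mlflow.update_current_trace` and honors the `mlflow.trace.user` / `mlflow.trace.session` tag keys (the function and values are illustrative):

```python
import mlflow

@mlflow.trace
def handle_request(question: str, user_id: str, session_id: str) -> str:
    # Attach metadata so the User, Session, and Tags columns are populated.
    mlflow.update_current_trace(
        tags={
            "mlflow.trace.user": user_id,        # surfaces in the User column
            "mlflow.trace.session": session_id,  # groups traces in the Session column
            "persona": "expert",                 # custom tag, filterable in the UI
        }
    )
    return f"Answering for {user_id}: {question}"
```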
Searching and Filtering Traces
The UI offers several ways to find and focus on relevant traces:
- Search bar (often labeled "Search evaluations by request" or similar): Quickly find traces by searching the content of their `Request` (input) field.
- Filters Dropdown: For more structured filtering, use the "Filters" dropdown. This typically allows you to build queries based on:
  - Attributes: Such as `Request` content, `Session time`, `Execution time`, or `Request time`.
  - Assessments: Filter by the presence or specific values of assessments such as `my_scorer` or `professional`.
  - Other fields: Such as `State`, `Trace name`, `Session`, `User`, and `Tags` (e.g., `tags.persona = 'expert'`).
- Sort Dropdown: Use the "Sort" dropdown to order traces by columns such as `Request time` or `Execution time`.
- Columns Dropdown: Customize which columns are visible in the trace list, including specific tags or assessment metrics.
For advanced programmatic querying and understanding the filter syntax that the UI might use, refer to the Query traces via SDK guide.
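The same kinds of filters can also be expressed in code. A minimal sketch using `mlflow.search_traces` (the tag name mirrors the UI example above; the exact filter-string grammar is documented in the SDK guide):

```python
import mlflow

# Returns a pandas DataFrame with one row per matching trace.
traces = mlflow.search_traces(
    filter_string="tags.persona = 'expert'",
    max_results=50,
)
print(traces.head())
```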
Exploring an Individual Trace
To dive deep into a specific trace, click on its Request or Trace Name in the list. This opens the detailed trace view, which typically has a few main panels:
- Trace breakdown (Left Panel):
  - This panel (often titled "Trace breakdown") displays the span hierarchy as a tree or waterfall chart. It shows all operations (spans) within the trace, their parent-child relationships, and their execution order and duration.
  - You can select individual spans from this breakdown to inspect their specific details.
- Span Details (Center Panel):
  - When a span is selected from the Trace breakdown, this panel shows its detailed information, usually organized into tabs such as:
    - Chat: For chat-based LLM interactions, this tab often provides a rendered view of the conversation flow (user, assistant, and tool messages).
    - Inputs / Outputs: Displays the raw input data passed to the operation and the raw output data it returned. For large content, a "See more" / "See less" toggle may be available to expand or collapse the view.
    - Attributes: Shows key-value metadata specific to the span (e.g., `model` name and `temperature` for an LLM call; `doc_uri` for a retriever span).
    - Events: For spans that encountered errors, this tab typically shows exception details and stack traces. For streaming spans, it may show individual data chunks as they were yielded.
  - Some output fields may also offer a Markdown toggle to switch between raw and rendered views when the content is in Markdown format.
- Assessments (Right Panel):
  - This panel displays any assessments (user feedback or evaluations) that have been logged for the entire trace or for the currently selected span.
  - Crucially, this panel often includes an "+ Add new assessment" button, allowing you to log new feedback or evaluation scores directly from the UI while reviewing a trace. This is very useful for manual review and labeling workflows.
Trace-Level Information: Beyond individual span details, the view also provides access to overall trace information. This includes trace-level tags and any assessments logged for the entire trace (often visible in the Assessments panel when no specific span or the root span is selected), which may originate from direct user feedback or systematic evaluations.
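The same span-level details are available programmatically on the trace object, which is handy when you want to script checks you would otherwise perform by clicking through the panels. A hedged sketch, assuming the `Trace` object exposes its spans under `trace.data.spans` (as in recent MLflow versions) and that you already have a trace ID from the list view or `mlflow.search_traces`:

```python
import mlflow

trace = mlflow.get_trace("your-trace-id")  # placeholder trace ID

for span in trace.data.spans:
    # Each span mirrors the Span Details panel: name, inputs/outputs,
    # and attributes such as model name or temperature.
    print(span.name, span.span_type)
    print("  inputs:    ", span.inputs)
    print("  outputs:   ", span.outputs)
    print("  attributes:", span.attributes)
```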
Common Debugging and Observation Scenarios
Here's how you can use the MLflow Tracing UI to address common debugging and observability needs:
- Identifying Slow Traces (Latency Bottlenecks):
  - In the Trace List View: Use the "Sort" dropdown to sort traces by "Execution time" in descending order. This brings the slowest traces to the top.
  - In the Detailed Trace View: Once you open a slow trace, examine the "Trace breakdown" panel. The waterfall display of spans visually highlights the operations that took the longest, helping you pinpoint latency bottlenecks within your application's flow.
- Finding Traces from a Particular User:
  - Using Filters: If you have tracked user information and it is available as a filter option (e.g., under "Attributes" or a dedicated "User" filter in the "Filters" dropdown), select or enter the specific user ID.
  - Using Search/Tags: Alternatively, if user IDs are stored as tags (e.g., `mlflow.trace.user`), use the search bar with a query like `tags.mlflow.trace.user = 'user_example_123'`.
- Locating Traces with Failures (Errors):
  - Using Filters: In the "Filters" dropdown, select the `State` attribute and choose `ERROR` to see only traces that failed.
  - In the Detailed Trace View: For an error trace, select the span marked with an error in the "Trace breakdown". Navigate to its "Events" tab in the Span Details panel to view the exception message and stack trace, which are crucial for diagnosing the root cause of the failure.
- Identifying Traces with Negative Feedback or Issues (e.g., Incorrect Responses):
  - Using Assessment Filters: If you are collecting user feedback or running evaluations that result in assessments (e.g., a boolean `is_correct` or a numeric `relevance_score`), the "Filters" dropdown may allow you to filter by these assessment names and their values (e.g., `is_correct = false` or `relevance_score < 0.5`).
  - Viewing Assessments: Open a trace and check the "Assessments" panel (on the right in the detailed view) or individual span assessments. This shows any logged feedback, scores, and rationales, helping you understand why a response was marked as poor quality.
These examples demonstrate how the detailed information captured by MLflow Tracing, combined with the UI's viewing and filtering capabilities, empowers you to efficiently debug issues and observe your application's behavior.
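Each scenario also has a programmatic counterpart via `mlflow.search_traces`. A hedged sketch (the attribute and column names shown, such as `attributes.status` and `execution_time_ms`, are assumptions that may vary slightly across MLflow versions; the user tag follows the convention used earlier):

```python
import mlflow

# Only failed traces.
errors = mlflow.search_traces(filter_string="attributes.status = 'ERROR'")

# Traces from a specific user, assuming the user ID was stored as a tag.
user_traces = mlflow.search_traces(
    filter_string="tags.mlflow.trace.user = 'user_example_123'"
)

# Slowest traces: sort the returned DataFrame by its execution-time column.
all_traces = mlflow.search_traces(max_results=200)
slowest = all_traces.sort_values("execution_time_ms", ascending=False).head(10)
```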
Databricks Notebook Integration for Traces
MLflow Tracing offers a seamless experience within Databricks notebooks, allowing you to view traces directly as part of your development and experimentation workflow.
The MLflow Tracing Databricks Notebook integration is available in MLflow 2.20 and above.
How it Works
When working in a Databricks notebook with your MLflow Tracking URI set to `"databricks"` (often the default, or set explicitly via `mlflow.set_tracking_uri("databricks")`), the trace UI can be automatically displayed in the output of a cell.
This typically occurs when:
- A cell's code execution generates a trace (e.g., by calling a function decorated with `@mlflow.trace` or an auto-instrumented library call).
- You explicitly call `mlflow.search_traces()` and the result is displayed.
- An `mlflow.entities.Trace` object (e.g., from `mlflow.get_trace()`) is the last expression in a cell or is passed to `display()`.
This in-notebook view provides the same rich, interactive trace exploration capabilities found in the main MLflow Experiments UI, helping you iterate faster without context switching.
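A minimal notebook sketch (the function and experiment path are illustrative):

```python
import mlflow

mlflow.set_tracking_uri("databricks")  # often already the default in Databricks
mlflow.set_experiment("/Shared/my-genai-app-traces")  # illustrative path

@mlflow.trace
def summarize(text: str) -> str:
    # Placeholder logic standing in for an LLM or chain call.
    return text[:100]

# Executing this cell generates a trace, and the interactive trace UI
# renders inline beneath the cell output.
summarize("MLflow Tracing makes GenAI apps observable.")
```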
Controlling Notebook Display
- To disable the automatic display of traces in notebook cell outputs, run `mlflow.tracing.disable_notebook_display()`.
- To re-enable it, run `mlflow.tracing.enable_notebook_display()`.
Programmatic Access and Management
Beyond the UI, MLflow provides APIs for interacting with traces programmatically:
- Query Traces via SDK: Learn how to search, filter, and retrieve trace data using Python for custom analysis, building evaluation datasets, or integration with other systems.
- Delete Traces: Understand how to remove traces based on specific criteria for data lifecycle management.
By combining UI-based exploration with programmatic access, you have a comprehensive toolkit for debugging your GenAI applications, understanding their behavior, and continuously improving their performance and quality.
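As a hedged sketch of those SDK entry points (the `MlflowClient.delete_traces` parameters shown reflect recent MLflow versions; the experiment ID and retention window are illustrative):

```python
import time

import mlflow
from mlflow import MlflowClient

# Retrieve traces for custom analysis or for building evaluation datasets.
traces = mlflow.search_traces(max_results=500)

# Remove traces older than 30 days from an experiment (data lifecycle management).
cutoff_ms = int(time.time() * 1000) - 30 * 24 * 60 * 60 * 1000
MlflowClient().delete_traces(
    experiment_id="123456789",       # illustrative experiment ID
    max_timestamp_millis=cutoff_ms,  # delete only traces created before this time
    max_traces=100,                  # cap how many traces are deleted per call
)
```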
Next steps
Continue your journey with these recommended actions and tutorials.
- Query traces via SDK - Programmatically search and analyze traces for custom workflows
- Use traces to improve quality - Leverage trace insights for application improvement
- Build evaluation datasets - Convert traces into test data for systematic evaluation
Reference guides
Explore detailed documentation for concepts and features mentioned in this guide.
- Tracing data model - Understand trace and span structure
- Delete traces - Learn about trace lifecycle management
- Logging assessments - Deep dive into feedback and assessment concepts