
Labeling during development

As a developer building GenAI applications, you need a way to track your observations about the quality of your application's outputs. MLflow Tracing allows you to add feedback or expectations directly to traces during development, giving you a quick way to record quality issues, mark successful examples, or add notes for future reference.

Prerequisites

  • Your application is instrumented with MLflow Tracing
  • You have generated traces by running your application (a minimal sketch covering both prerequisites follows below)
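
If your application is not yet instrumented, the sketch below shows one minimal way to satisfy both prerequisites. It is illustrative only: the tracking URI, experiment name, and the greet function are placeholders for your own setup and application logic.

Python

import mlflow

# Point at your tracking server and experiment (adjust for your environment;
# these values are placeholders).
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("my-genai-app")


@mlflow.trace
def greet(name: str) -> str:
    # Your real application logic (LLM calls, retrieval, etc.) goes here.
    return f"Hello, {name}!"


# Each call produces a trace that you can label later in the UI or via the SDK.
greet("world")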

Add labels to traces via the UI

MLflow makes it easy to add annotations (labels) directly to traces through the MLflow UI.

note

If you are using a Databricks Notebook, you can also perform these steps from the Trace UI that renders inline in the notebook.

(Screenshot: adding human feedback to a trace in the MLflow UI)

  1. Navigate to the Traces tab in the MLflow Experiment UI
  2. Open an individual trace
  3. Within the trace UI, click on the specific span you want to label
    • Selecting the root span attaches feedback to the entire trace
  4. Expand the Assessments tab at the far right
  5. Fill in the form to add your feedback
    • Assessment Type
      • Feedback: Subjective assessment of quality (ratings, comments)
      • Expectation: The expected output or value (what should have been produced)
    • Assessment Name
      • A name identifying what the assessment measures
    • Data Type
      • Number
      • Boolean
      • String
    • Value
      • The assessment value (matching the selected data type)
    • Rationale
      • Optional notes about the value
  6. Click Create to save your label
  7. When you return to the Traces tab, your label will appear as a new column (you can also verify it from code, as sketched below)
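
To confirm a label you added in the UI from code, something like the sketch below may help. It assumes MLflow 3, that you copy the trace ID from the trace details view, and that logged assessments are exposed on trace.info.assessments with name and value attributes.

Python

import mlflow

# Paste the trace ID copied from the trace details view in the UI
# (placeholder value).
trace_id = "<your-trace-id>"

trace = mlflow.get_trace(trace_id)

# Assumption: in MLflow 3, assessments attached to the trace are available
# on trace.info.assessments.
for assessment in trace.info.assessments:
    print(assessment.name, assessment.value)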

Add labels to traces via the SDK

You can programmatically add labels to traces using MLflow's SDK. This is useful for automated labeling based on your application logic or for batch processing of traces.

For a complete set of examples, see the logging assessments concept page.

Python

import mlflow


@mlflow.trace
def my_app(input: str) -> str:
    return input + "_output"


my_app(input="hello")

trace_id = mlflow.get_last_active_trace_id()

# Log a thumbs up/down rating
mlflow.log_feedback(
    trace_id=trace_id,
    name="quality_rating",
    value=1,  # 1 for thumbs up, 0 for thumbs down
    rationale="The response was accurate and helpful",
    source=mlflow.entities.assessment.AssessmentSource(
        source_type=mlflow.entities.assessment.AssessmentSourceType.HUMAN,
        source_id="bob@example.com",
    ),
)

# Log expected response text
mlflow.log_expectation(
    trace_id=trace_id,
    name="expected_response",
    value="The capital of France is Paris.",
    source=mlflow.entities.assessment.AssessmentSource(
        source_type=mlflow.entities.assessment.AssessmentSourceType.HUMAN,
        source_id="bob@example.com",
    ),
)

(Screenshot: feedback logged via the SDK displayed in the MLflow Trace UI)
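
The example above labels a single trace. For the batch-processing case mentioned earlier, one possible pattern is to search for traces and loop over them. The sketch below is illustrative only: the experiment ID, the passed_check heuristic, the CODE source type, and the trace_id attribute (MLflow 3 naming) are assumptions to adapt to your setup.

Python

import mlflow
from mlflow import MlflowClient
from mlflow.entities.assessment import AssessmentSource, AssessmentSourceType

client = MlflowClient()

# Fetch recent traces from an experiment (replace the experiment ID with one
# from your own tracking server).
traces = client.search_traces(experiment_ids=["<experiment-id>"], max_results=50)

for trace in traces:
    # Replace this placeholder with your own scoring logic, e.g. regex checks
    # on the trace's outputs or a call to an LLM judge.
    passed_check = True

    mlflow.log_feedback(
        trace_id=trace.info.trace_id,  # assumes MLflow 3's TraceInfo.trace_id
        name="passed_automated_check",
        value=passed_check,
        source=AssessmentSource(
            source_type=AssessmentSourceType.CODE,  # assumption: CODE source type
            source_id="batch_labeling_script",
        ),
    )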

Next steps

Continue your journey with these recommended actions and tutorials.

Reference guides

Explore detailed documentation for concepts and features mentioned in this guide.