
Search traces programmatically

Search and analyze traces programmatically using mlflow.search_traces(). This function can query traces stored in the MLflow tracking server, inference tables, or Unity Catalog tables. You can select subsets of traces to analyze or to create evaluation datasets.

mlflow.search_traces() API

Python
def mlflow.search_traces(
    experiment_ids: list[str] | None = None,
    filter_string: str | None = None,
    max_results: int | None = None,
    order_by: list[str] | None = None,
    extract_fields: list[str] | None = None,
    run_id: str | None = None,
    return_type: Literal['pandas', 'list'] | None = None,
    model_id: str | None = None,
    sql_warehouse_id: str | None = None,
    include_spans: bool = True,
    locations: list[str] | None = None,
) -> pandas.DataFrame | list[Trace]

mlflow.search_traces() lets you filter and select data along a few dimensions:

  • Filter by a query string
  • Filter by locations: experiment, run, model, or Unity Catalog schema
  • Limit data: max results, include or exclude spans
  • Adjust return value format: data format, data order

search_traces() returns either a pandas DataFrame or a list of Trace objects, which can then be analyzed further or reshaped into evaluation datasets. See the schema details of these return types.

See the mlflow.search_traces() API docs for full details.

note

Databricks-managed MLflow and OSS (open source software) MLflow share most search query syntax but have a few field-level differences. See Differences from OSS MLflow for details.

mlflow.search_traces() parameters

| Category | Parameter: type | Description | Example |
| --- | --- | --- | --- |
| Filter by query string | filter_string: str | See Search query syntax for supported filters and comparators. | trace.status = 'OK' AND tag.environment = 'production' |
| Filter by locations | locations: list[str] | A list of experiment IDs or Unity Catalog catalog.schema locations. Use this to search traces stored in inference or Unity Catalog tables. | ['591498498138889', '782498488231546'] or ['my_catalog.my_schema'] |
| Filter by locations | run_id: str | MLflow run ID | 35464a26b0144533b09d8acbb4681985 |
| Filter by locations | model_id: str | MLflow model ID | acc4c426-5dd7-4a3a-85de-da1b22ce05f1 |
| Limit data | max_results: int | Maximum number of traces (rows) to return | 100 |
| Limit data | include_spans: bool | Include or exclude spans in the results. Spans carry trace details and can make result sizes much larger. | True |
| Return value format | order_by: list[str] | See the syntax and supported keys. | ["timestamp_ms DESC", "status ASC"] |
| Return value format | return_type: Literal['pandas', 'list'] | Return either a pandas DataFrame or a list of Trace objects. See schema details. | 'pandas' |
| Deprecated | experiment_ids: list[str] | Use locations instead. | |
| Deprecated | extract_fields: list[str] | Select fields in the returned DataFrame or trace objects instead. | |
| Deprecated | sql_warehouse_id: str | Use the MLFLOW_TRACING_SQL_WAREHOUSE_ID environment variable instead. | |

Search query syntax

The filter_string argument uses a SQL-like query language to filter traces. String values must be wrapped in single quotes (for example, trace.status = 'OK'), and numeric values must not be quoted (for example, trace.execution_time_ms > 1000). Combine conditions with AND. The OR operator is not supported.
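The quoting rules can be sketched directly in Python: string values single-quoted, numeric values bare, clauses joined with AND.

```python
# String values are single-quoted, numeric values are bare,
# and clauses are combined with AND (OR is not supported).
filter_string = (
    "trace.status = 'OK' "
    "AND trace.execution_time_ms > 1000 "
    "AND tag.environment = 'production'"
)

# Pass it to the search, e.g.:
# traces = mlflow.search_traces(filter_string=filter_string, locations=[...])
```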

Supported filters and comparators

The following fields and comparators are supported on Databricks-managed MLflow.

note

Filters marked (UC only) are supported only for MLflow traces stored in Unity Catalog. See Store OpenTelemetry traces in Unity Catalog.

| Field type | Fields | Comparators | Example |
| --- | --- | --- | --- |
| Trace status | trace.status | =, != | trace.status = 'OK' |
| Trace timestamps | trace.timestamp_ms, trace.execution_time_ms, trace.end_time_ms (UC only) | =, !=, >, <, >=, <= | trace.end_time_ms > 1762408895531 |
| Trace IDs | trace.run_id | = | trace.run_id = 'run_id' |
| String fields | trace.client_request_id (UC only), trace.name | =, !=, LIKE, ILIKE, RLIKE | trace.name LIKE '%Generate%' |
| Request and response content (UC only) | trace.request, trace.response | =, !=, LIKE, ILIKE, RLIKE | trace.request LIKE '%weather%' |
| Token count (UC only) | trace.token_count | =, !=, >, <, >=, <= | trace.token_count > 1000 |
| Linked prompts | prompt | = (format: 'name/version') | prompt = 'qa-system-prompt/4' |
| Span name, type, and status (UC only) | span.name, span.type, span.status | =, !=, LIKE, ILIKE, RLIKE | span.type RLIKE '^LLM' |
| OTel span attributes (UC only) | span.attributes.<key> | =, !=, LIKE, ILIKE, RLIKE, IS NULL, IS NOT NULL | span.attributes.gen_ai.request.model = 'gpt-4' |
| Tags | tag.<key> | =, !=, LIKE, ILIKE, RLIKE, IS NULL, IS NOT NULL. For traces stored in an experiment (not in Unity Catalog), only = and != are supported. | tag.environment = 'production' |
| Metadata | metadata.<key> | =, !=, LIKE, ILIKE, RLIKE, IS NULL, IS NOT NULL. For traces stored in an experiment (not in Unity Catalog), only = and != are supported. | metadata.`mlflow.trace.user` = 'user_123' |
| Feedback (UC only) | feedback.<name> | =, !=, LIKE, ILIKE, RLIKE | feedback.rating = 'excellent' |
| Expectations (UC only) | expectation.<name> | =, !=, LIKE, ILIKE, RLIKE | expectation.result = 'pass' |

Differences from OSS MLflow

The search query syntax on Databricks-managed MLflow closely tracks OSS MLflow, with the following differences:

| Field | Databricks-managed MLflow | OSS MLflow | Notes |
| --- | --- | --- | --- |
| trace.request, trace.response | Supported (UC only) | Not supported | Use these fields to filter on serialized request and response content. |
| trace.token_count | Supported (UC only) | Not supported | Filter traces by total token count. |
| span.attributes.<key> | Supported (UC only) | Not supported | Filter traces by OpenTelemetry span attributes. |
| trace.text | Not supported | Supported (SQLAlchemy store only) | OSS exposes trace.text for full-text search across trace content. On Databricks, use trace.request and trace.response to filter on trace content instead. |
| trace.prompt | Not supported | Supported (mapped to linked prompts tag) | On Databricks, use the top-level prompt field. |
| trace.request_id | Not supported | Supported | On Databricks, use trace.client_request_id instead. |
| issue.id | Not supported | Supported | Filter traces linked to a specific issue ID. |

Search for third-party OpenTelemetry spans

To search traces ingested from third-party OpenTelemetry tools such as Langfuse, filter on span attributes using the span.attributes.* prefix. See Search for traces by OTel span attributes.

Best practices

Keyword arguments

Always use keyword (named) arguments with mlflow.search_traces(). The function accepts positional arguments, but its signature is evolving, so positional calls can break between releases.

Good practice: mlflow.search_traces(filter_string="trace.status = 'OK'")

Bad practice: mlflow.search_traces([], "trace.status = 'OK'")

filter_string gotchas

When searching using the filter_string argument to mlflow.search_traces(), remember to:

  • Use prefixes: trace., tag., or metadata.
  • Use backticks if tag or attribute names have dots: tag.`mlflow.traceName`
  • Use single quotes only: 'value' not "value"
  • Use Unix timestamp (milliseconds) for time: 1749006880539 not dates
  • Use AND only: No OR support

See Search query syntax for the full list of supported fields and operators.
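The timestamp and backtick rules above can be sketched like this (the metadata key and cutoff date are illustrative):

```python
from datetime import datetime, timezone

# Dates must be expressed as Unix timestamps in milliseconds.
cutoff_ms = int(datetime(2025, 6, 1, tzinfo=timezone.utc).timestamp() * 1000)

# Backticks around keys that contain dots; single quotes around string values.
filter_string = (
    f"trace.timestamp_ms > {cutoff_ms} "
    "AND metadata.`mlflow.trace.user` = 'user_123'"
)
```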

SQL warehouse integration

mlflow.search_traces() can optionally use a Databricks SQL warehouse to improve performance on large trace datasets in inference tables or Unity Catalog tables. Specify your SQL warehouse ID using the MLFLOW_TRACING_SQL_WAREHOUSE_ID environment variable.

For example:

Python
import mlflow
import os

os.environ['MLFLOW_TRACING_SQL_WAREHOUSE_ID'] = 'fa92bea7022e81fb'

# Use SQL warehouse for better performance
traces = mlflow.search_traces(
    filter_string="trace.status = 'OK'",
    locations=['my_catalog.my_schema'],
)

Pagination

mlflow.search_traces() returns results in memory, which works well for smaller result sets. To handle large result sets, use MlflowClient.search_traces(), which supports pagination.

Next steps