Search traces programmatically
Search and analyze traces programmatically using mlflow.search_traces(). This function can query traces stored in the MLflow tracking server, inference tables, or Unity Catalog tables. You can select subsets of traces to analyze or to create evaluation datasets.
mlflow.search_traces() API
def mlflow.search_traces(
    experiment_ids: list[str] | None = None,
    filter_string: str | None = None,
    max_results: int | None = None,
    order_by: list[str] | None = None,
    extract_fields: list[str] | None = None,
    run_id: str | None = None,
    return_type: Literal['pandas', 'list'] | None = None,
    model_id: str | None = None,
    sql_warehouse_id: str | None = None,
    include_spans: bool = True,
    locations: list[str] | None = None,
) -> pandas.DataFrame | list[Trace]
mlflow.search_traces() lets you filter and select data along a few dimensions:
- Filter by a query string
- Filter by locations: experiment, run, model, or Unity Catalog schema
- Limit data: max results, include or exclude spans
- Adjust return value format: data format, data order
search_traces() returns either a pandas DataFrame or a list of Trace objects, which can then be analyzed further or reshaped into evaluation datasets. See the schema details of these return types.
See the mlflow.search_traces() API docs for full details.
mlflow.search_traces() parameters
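| Category | Parameter | Description | Example |
|---|---|---|---|
| Filter by query string | filter_string: str | See the search query syntax including supported filters and comparators. | attributes.status = 'OK' |
| Filter by locations | locations: list[str] | A list of experiment IDs or Unity Catalog schemas to search. | ['my_catalog.my_schema'] |
| | run_id: str | MLflow run ID | |
| | model_id: str | MLflow model ID | |
| Limit data | max_results: int | Max number of traces (rows) to return | |
| | include_spans: bool | Include or exclude spans from the results. Spans include trace details and can make result sizes much larger. | |
| Return value format | order_by: list[str] | See the syntax and supported keys. | |
| | return_type: Literal['pandas', 'list'] | This function can return either a pandas DataFrame or a list of Trace objects. | 'pandas' or 'list' |
| Deprecated | experiment_ids: list[str] | Use locations instead. | |
| | extract_fields: list[str] | Select fields in the returned DataFrame or trace objects instead. | |
| | sql_warehouse_id: str | Use the MLFLOW_TRACING_SQL_WAREHOUSE_ID environment variable instead. | |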
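As a rough illustration of how these parameters combine, the sketch below wraps a single call that filters by status, restricts the search to one location, caps the row count, sorts newest first, and skips span data. The helper name search_recent_ok_traces and the sort key timestamp_ms are assumptions made for this example; check the order_by syntax reference for the keys your MLflow version supports.

```python
def search_recent_ok_traces(location: str, limit: int = 50):
    """Return up to `limit` successful traces from one location, newest first.

    `location` is an experiment ID or a Unity Catalog schema name.
    This helper is hypothetical; only mlflow.search_traces() comes from MLflow.
    """
    # Imported lazily so the module can be loaded without a configured tracking server.
    import mlflow

    return mlflow.search_traces(
        filter_string="attributes.status = 'OK'",  # single quotes inside the filter
        locations=[location],
        max_results=limit,
        order_by=["timestamp_ms DESC"],  # assumed sort key, newest traces first
        include_spans=False,  # omit span details to keep the result small
        return_type="pandas",
    )
```

Setting include_spans=False is a reasonable default when you only need trace-level columns, since spans can make results much larger.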
Best practices
Keyword arguments
Always use keyword (named) arguments with mlflow.search_traces(). The function accepts positional arguments, but its parameter list is still evolving, so positional calls may break when parameters are added, reordered, or deprecated.
Good practice: mlflow.search_traces(filter_string="attributes.status = 'OK'")
Bad practice: mlflow.search_traces([], "attributes.status = 'OK'")
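One way to enforce this convention in a shared codebase is a thin wrapper whose parameters are keyword-only. The sketch below is an illustration, not part of MLflow: the wrapper name search_traces_kw is hypothetical, and it exposes only a few of the parameters from the signature above.

```python
from __future__ import annotations

from typing import Any


def search_traces_kw(
    *,  # everything after the bare * must be passed by name
    filter_string: str | None = None,
    locations: list[str] | None = None,
    max_results: int | None = None,
    **extra: Any,
):
    """Forward keyword-only arguments to mlflow.search_traces()."""
    # Imported lazily so the module can be loaded without a configured tracking server.
    import mlflow

    return mlflow.search_traces(
        filter_string=filter_string,
        locations=locations,
        max_results=max_results,
        **extra,
    )
```

With this wrapper, a positional call such as search_traces_kw([], "attributes.status = 'OK'") raises TypeError immediately instead of binding values to the wrong parameters.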
filter_string gotchas
When searching using the filter_string argument to mlflow.search_traces(), remember to:
- Use prefixes: attributes., tags., or metadata.
- Use backticks if tag or attribute names have dots: tags.`mlflow.traceName`
- Use single quotes only: 'value' not "value"
- Use Unix timestamp (milliseconds) for time: 1749006880539 not dates
- Use AND only: No OR support
See the search query syntax for further details.
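The rules above can be applied mechanically. The following is a minimal sketch of a filter builder that adds prefixes, backticks dotted names, single-quotes values, converts datetimes to Unix milliseconds, and joins clauses with AND. The helper names are hypothetical, and the field attributes.timestamp_ms is an assumption; confirm the supported field names in the search query syntax.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to Unix milliseconds using integer math."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    return (moment - _EPOCH) // timedelta(milliseconds=1)


def quote_key(name: str) -> str:
    """Wrap a tag or attribute name in backticks when it contains a dot."""
    return f"`{name}`" if "." in name else name


def quote_value(value: str) -> str:
    """Single-quote a value; embedded single quotes are rejected, not escaped."""
    if "'" in value:
        raise ValueError("single quotes inside values are not handled by this sketch")
    return f"'{value}'"


def build_filter(
    status: str | None = None,
    tags: dict[str, str] | None = None,
    since: datetime | None = None,
) -> str:
    """Build a filter_string whose clauses are combined with AND only."""
    clauses = []
    if status is not None:
        clauses.append(f"attributes.status = {quote_value(status)}")
    for key, value in (tags or {}).items():
        clauses.append(f"tags.{quote_key(key)} = {quote_value(value)}")
    if since is not None:
        # Assumed field name; times must be Unix milliseconds, not date strings.
        clauses.append(f"attributes.timestamp_ms > {to_epoch_ms(since)}")
    return " AND ".join(clauses)
```

The result can be passed as filter_string=build_filter(...). Because OR is not supported, a query that needs alternatives has to be split into several searches whose results are merged afterward.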
SQL warehouse integration
mlflow.search_traces() can optionally use a Databricks SQL warehouse to improve performance on large trace datasets in inference tables or Unity Catalog tables. Specify your SQL warehouse ID using the MLFLOW_TRACING_SQL_WAREHOUSE_ID environment variable.
The following example sets the warehouse ID and then searches traces in a Unity Catalog schema:
import os

import mlflow

# Route trace queries through a Databricks SQL warehouse (example ID)
os.environ['MLFLOW_TRACING_SQL_WAREHOUSE_ID'] = 'fa92bea7022e81fb'

# Search successful traces stored in a Unity Catalog schema
traces = mlflow.search_traces(
    filter_string="attributes.status = 'OK'",
    locations=['my_catalog.my_schema'],
)
Pagination
mlflow.search_traces() returns results in memory, which works well for smaller result sets. To handle large result sets, use MlflowClient.search_traces() since it supports pagination.
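A paginated loop might look like the sketch below. It assumes that MlflowClient.search_traces() accepts a page_token argument and returns a page object whose token attribute is empty once the last page has been read. The helper names iter_traces and iter_experiment_traces are hypothetical, and depending on your MLflow version you may prefer to pass locations rather than experiment_ids.

```python
from __future__ import annotations

from typing import Any, Callable, Iterator


def iter_traces(search_page: Callable[..., Any], **search_kwargs: Any) -> Iterator[Any]:
    """Yield traces one at a time, fetching the next page only when needed."""
    page_token = None
    while True:
        page = search_page(page_token=page_token, **search_kwargs)
        yield from page
        # A missing or empty token means there are no further pages.
        page_token = getattr(page, "token", None)
        if not page_token:
            return


def iter_experiment_traces(experiment_id: str, page_size: int = 100) -> Iterator[Any]:
    """Stream all traces of one experiment, page_size traces per request."""
    # Imported lazily so the module can be loaded without a configured tracking server.
    from mlflow import MlflowClient

    client = MlflowClient()
    return iter_traces(
        client.search_traces,
        experiment_ids=[experiment_id],
        max_results=page_size,
    )
```

Because iter_traces is a generator, only one page is held in memory at a time, so a caller can process or write out each trace before the next request is made.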
Next steps
- Tutorial: Search traces programmatically - Run a set of simple examples of mlflow.search_traces()
- Tutorial: Trace and analyze users and environments - Run an example of adding context metadata to traces and analyzing the results
- Examples: Analyzing traces - See a variety of examples of trace analysis
- Build evaluation datasets - Convert queried traces into test datasets