# Deploy a traced app
MLflow Tracing helps you monitor GenAI applications in production by capturing execution details. You can deploy traced applications in two ways:
- On Databricks: Deploy using Agent Framework or custom model serving with full integration for monitoring and inference tables
- Outside Databricks: Deploy to external environments while logging traces back to Databricks for monitoring
## Compare deployment options
The table below compares trace logging options available for each deployment location:
| Deployment location | MLflow experiment | Delta table | Inference tables (legacy) |
|---|---|---|---|
| Databricks | Supported | Supported | Supported |
| Outside Databricks | Supported | Supported | Not supported |
## Compare trace logging options
The table below compares the trace logging options listed in the deployment options table above:
| Trace logging option | Access and governance | Latency* | Throughput* | Size limits* |
|---|---|---|---|---|
| MLflow experiment | Traces can be viewed in the MLflow experiment UI or queried programmatically. Access is governed by MLflow experiment ACLs.† | Real-time | Max 60 queries per second (QPS) | Supports very large traces. Max 100K traces per experiment. |
| Delta table | Traces logged to Delta tables are governed using Unity Catalog privileges. | ~15 minute delay | Max 50 queries per second (QPS) | Supports very large traces. Max 100K traces per experiment. |
| Inference tables (legacy) | Traces logged to Delta tables are governed using Unity Catalog privileges. | 30-90 minute delay | QPS limits match model serving endpoint limits | Limits on trace size. No limit on traces per experiment. |
\* See Resource limits for other platform limits, as well as information about which limits can be raised by working with your Databricks account team.
† For MLflow experiment logging, traces are stored as artifacts, for which you can specify a custom storage location. For example, if you create a workspace experiment with artifact_location set to a Unity Catalog volume, then trace data access is governed by Unity Catalog volume privileges.
## Next steps
Choose your deployment approach: