Monitor model quality and endpoint health

Databricks Model Serving provides advanced tooling for monitoring the quality and health of models and their deployments. The following table gives an overview of each available monitoring tool.

Tool

Description

Purpose

Access

Service logs

Captures stdout and stderr streams from the model serving endpoint.

Useful for debugging during model deployment. Use print(..., flush=True) so that output appears in the logs immediately.

Accessible from the Logs tab in the Serving UI. Logs are streamed in real time and can be exported through the API.
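
As an illustration of the flush advice above, here is a minimal sketch of logging from model code. The helper `log` and the doubling `predict` function are hypothetical stand-ins and not part of any Databricks API; the only point is that `flush=True` pushes each line to stdout or stderr right away instead of leaving it in a buffer.

```python
import sys
import time


def log(message: str, *, stream=sys.stdout) -> None:
    """Write a timestamped line and flush it immediately."""
    timestamp = time.strftime("%Y-%m-%dT%H:%M:%S")
    # flush=True (capital T) avoids buffering, so the line is emitted at once.
    print(f"[{timestamp}] {message}", file=stream, flush=True)


def predict(model_input: list[float]) -> list[float]:
    """Hypothetical model logic: double every input value."""
    log(f"received {len(model_input)} values")
    try:
        result = [2.0 * x for x in model_input]
    except TypeError as exc:
        # Errors go to stderr, which the service logs also capture.
        log(f"prediction failed: {exc}", stream=sys.stderr)
        raise
    log("prediction finished")
    return result
```

Calling `predict([1.0, 2.5])` writes two lines to stdout and returns the doubled values.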

Build logs

Displays output from the process that automatically creates a production-ready Python environment for the model serving endpoint.

Useful for diagnosing model deployment and dependency issues.

Available upon completion of the model serving build under Build logs in the Logs tab. Logs can be exported through the API.
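
Exporting logs through the API could look roughly like the sketch below. The URL paths (`/api/2.0/serving-endpoints/{name}/served-models/{served_model_name}/logs` and `.../build-logs`) are written from memory of the Serving endpoints REST API and should be checked against the API reference for your workspace. The host, token, and names are placeholders, and the response is returned as raw text because its exact shape is not assumed here.

```python
import urllib.request
from urllib.parse import quote


def logs_url(host: str, endpoint_name: str, served_model_name: str,
             kind: str = "logs") -> str:
    """Build the URL for service logs ("logs") or build logs ("build-logs")."""
    if kind not in ("logs", "build-logs"):
        raise ValueError("kind must be 'logs' or 'build-logs'")
    # Assumed path layout; verify against the REST API reference.
    return (
        f"{host.rstrip('/')}/api/2.0/serving-endpoints/"
        f"{quote(endpoint_name, safe='')}/served-models/"
        f"{quote(served_model_name, safe='')}/{kind}"
    )


def fetch_logs(host: str, token: str, endpoint_name: str,
               served_model_name: str, kind: str = "logs") -> str:
    """Send an authenticated GET request and return the response body as text."""
    request = urllib.request.Request(
        logs_url(host, endpoint_name, served_model_name, kind),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

Only `fetch_logs` touches the network; `logs_url` can be checked offline before any token is involved.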

Endpoint health metrics

Provides insights into infrastructure metrics like latency, request rate, error rate, CPU usage, and memory usage.

Important for understanding the performance and health of the serving infrastructure.

Available by default in the Serving UI for the last 14 days. Data can also be streamed to observability tools in real time.
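
For pulling health metrics into an external tool, one possible approach is to read them in Prometheus-style text format and parse the samples. The sketch below assumes a metrics path of `/api/2.0/serving-endpoints/{name}/metrics` (to be verified against the API reference) and uses made-up metric names in `SAMPLE`. The parser is deliberately simple and assumes each sample line ends with its value, with no trailing timestamp.

```python
from urllib.parse import quote

# Hypothetical sample in Prometheus text format; the metric names are invented.
SAMPLE = """\
# HELP cpu_usage_percentage Hypothetical CPU gauge
# TYPE cpu_usage_percentage gauge
cpu_usage_percentage{endpoint_name="demo"} 12.5
request_count_total{endpoint_name="demo"} 42
"""


def metrics_url(host: str, endpoint_name: str) -> str:
    """Build the assumed metrics export URL for an endpoint."""
    return (
        f"{host.rstrip('/')}/api/2.0/serving-endpoints/"
        f"{quote(endpoint_name, safe='')}/metrics"
    )


def parse_metrics(text: str) -> dict[str, float]:
    """Map each series (name plus labels) to its numeric value."""
    samples: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.rsplit(None, 1)  # split off the last token as the value
        if len(parts) != 2:
            continue
        series, raw_value = parts
        try:
            samples[series] = float(raw_value)
        except ValueError:
            continue  # ignore lines that do not end in a number
    return samples
```

In practice a Prometheus-compatible scraper would usually do this parsing for you; the function is only meant to show what the exported data looks like once it is read.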

Inference tables

Automatically logs online prediction requests and responses into Delta tables managed by Unity Catalog.

Use this tool for monitoring and debugging model quality or responses, generating training data sets, or conducting compliance audits.

Can be enabled for new and existing model serving endpoints with a single click in the UI or through the API.
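
Enabling inference tables through the API might be sketched as follows. This assumes the endpoint configuration accepts an `auto_capture_config` object with `catalog_name`, `schema_name`, `table_name_prefix`, and `enabled` fields; those field names are an assumption based on the Serving endpoints REST API and may differ between API versions, so confirm them before use. The catalog, schema, and prefix values are placeholders.

```python
import json


def inference_table_config(catalog_name: str, schema_name: str,
                           table_name_prefix: str,
                           enabled: bool = True) -> dict:
    """Build the (assumed) config fragment that turns on inference tables."""
    for label, value in (("catalog_name", catalog_name),
                         ("schema_name", schema_name),
                         ("table_name_prefix", table_name_prefix)):
        if not value:
            raise ValueError(f"{label} must be a non-empty string")
    return {
        "auto_capture_config": {
            "catalog_name": catalog_name,            # Unity Catalog catalog
            "schema_name": schema_name,              # schema inside that catalog
            "table_name_prefix": table_name_prefix,  # prefix for the Delta table
            "enabled": enabled,
        }
    }


if __name__ == "__main__":
    # Print the JSON fragment that would be merged into the endpoint config.
    config = inference_table_config("main", "monitoring", "churn_endpoint")
    print(json.dumps(config, indent=2))
```

The fragment would be merged into the body of an endpoint create or update request alongside the served model definitions.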