Monitor model quality and endpoint health
Mosaic AI Model Serving provides advanced tooling for monitoring the quality and health of models and their deployments. The following table summarizes each available monitoring tool.
Tool | Description | Purpose | Access |
---|---|---|---|
Service logs | Captures | Useful for debugging during model deployment. Use | Accessible using the Logs tab in the Serving UI. Logs are streamed in real time and can be exported through the API. |
Build logs | Displays output from the process that automatically creates a production-ready Python environment for the model serving endpoint. | Useful for diagnosing model deployment and dependency issues. | Available upon completion of the model serving build under Build logs in the Logs tab. Logs can be exported through the API. |
Endpoint health metrics | Provides insights into infrastructure metrics like latency, request rate, error rate, CPU usage, and memory usage. | Important for understanding the performance and health of the serving infrastructure. | Available by default in the Serving UI for the last 14 days. Data can also be streamed to observability tools in real time. |
Inference tables | Automatically logs online prediction requests and responses into Delta tables managed by Unity Catalog for custom models. | Use this tool for monitoring and debugging model quality or responses, generating training data sets, or conducting compliance audits. | Can be enabled for existing and new model serving endpoints using a single click in the Serving UI or programmatically using Serving APIs. |
AI Gateway-enabled inference tables | Automatically logs online prediction requests and responses into Delta tables managed by Unity Catalog for endpoints that serve external models or provisioned throughput workloads. | Use this tool for monitoring and debugging model quality or responses, generating training data sets, or conducting compliance audits. | Can be enabled for existing and new model serving endpoints when enabling AI Gateway features using the Serving UI or REST API. |
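
The table notes that logs can be exported through the API and that metrics can be streamed to observability tools. The following is a minimal Python sketch of how such an export might look, not a definitive implementation. The REST paths (`/api/2.0/serving-endpoints/.../logs`, `.../build-logs`, `.../metrics`), the helper names, and the `DATABRICKS_HOST` and `DATABRICKS_TOKEN` environment variables are assumptions that do not appear in the table; check them against the current Serving endpoints API reference before relying on them.

```python
import urllib.request
from urllib.parse import quote

# Assumed REST prefix for serving endpoints; verify against the API reference.
API_ROOT = "/api/2.0/serving-endpoints"


def service_logs_path(endpoint: str, served_model: str) -> str:
    """Path assumed to return the service logs of one served model."""
    return (
        f"{API_ROOT}/{quote(endpoint, safe='')}"
        f"/served-models/{quote(served_model, safe='')}/logs"
    )


def build_logs_path(endpoint: str, served_model: str) -> str:
    """Path assumed to return the build logs of one served model."""
    return (
        f"{API_ROOT}/{quote(endpoint, safe='')}"
        f"/served-models/{quote(served_model, safe='')}/build-logs"
    )


def metrics_path(endpoint: str) -> str:
    """Path assumed to return endpoint health metrics for scraping."""
    return f"{API_ROOT}/{quote(endpoint, safe='')}/metrics"


def fetch(host: str, token: str, path: str, timeout: float = 30.0) -> str:
    """Send an authenticated GET to host + path and return the body as text."""
    request = urllib.request.Request(
        host.rstrip("/") + path,
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Hypothetical usage (endpoint and served model names are placeholders):
#   import os
#   host = os.environ["DATABRICKS_HOST"]
#   token = os.environ["DATABRICKS_TOKEN"]
#   print(fetch(host, token, build_logs_path("my-endpoint", "my-model-1")))
```

Keeping the path builders separate from `fetch` means the URLs can be checked without network access, and the same `fetch` helper can be reused by a scheduled job that forwards the metrics output to an observability tool.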