Review anomaly detection logged results
By default, data quality monitoring scan results are stored in the system.data_quality_monitoring.table_results table. Only account admins can access this table, and they must grant access to others as needed. Anomaly detection results are kept in default storage, and you are not billed for that storage.
The results table system.data_quality_monitoring.table_results contains all results across the entire metastore and includes sample values from tables in each catalog. Use caution when granting access to this table.
Anomaly detection result table schema
Each row in the results table corresponds to a single table in the schema that was scanned.
The table has the following schema:
| Column name | Contents (for struct data type) | Data type | Description | Example data |
|---|---|---|---|---|
|  |  | timestamp | Time when the row was generated. |  |
|  |  | string | Name of the catalog. Used to identify the table. |  |
|  |  | string | Name of the schema. Used to identify the table. |  |
|  |  | string | Name of the table. Used to identify the table. |  |
|  |  | string | Stable ID for the catalog. |  |
|  |  | string | Stable ID for the schema. |  |
|  |  | string | Stable ID for the table. |  |
|  |  | string | Consolidated health status at the table level. |  |
|  |  | struct | Freshness checks. |  |
|  |  | string | Overall freshness status. |  |
|  | commit_freshness |  | Commit freshness check results. |  |
|  |  | struct | Completeness check results. |  |
|  |  | string | Status of completeness check. |  |
|  | total_row_count |  | Total number of rows in the table over time. |  |
|  | daily_row_count |  | Number of rows added each day. |  |
| downstream_impact |  | struct | Summary of downstream impact based on dependency graph. |  |
|  |  | int | Severity indicator. Higher values indicate greater disruption. |  |
|  |  | int | Number of downstream tables affected. |  |
|  |  | int | Number of queries run on affected downstream tables over the last 30 days. |  |
|  |  | struct | Information about upstream jobs contributing to the issue. |  |
|  | upstream_jobs | array | Metadata for each upstream job. |  |
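
As a rough illustration of how rows from the results table might be consumed once they have been fetched, the sketch below keeps only the tables whose consolidated status is not healthy. The field names `catalog_name`, `schema_name`, `table_name`, and `status`, and the value `"Healthy"`, are assumptions made for this example; confirm the actual column names and status values in your workspace before relying on them.

```python
# Hypothetical field names and status value; verify against the real table.
HEALTHY = "Healthy"


def unhealthy_tables(rows):
    """Return fully qualified names of tables whose status is not healthy."""
    names = []
    for row in rows:
        if row.get("status") != HEALTHY:
            names.append(
                f'{row["catalog_name"]}.{row["schema_name"]}.{row["table_name"]}'
            )
    return names


# Made-up rows standing in for query results.
rows = [
    {"catalog_name": "main", "schema_name": "sales",
     "table_name": "orders", "status": "Healthy"},
    {"catalog_name": "main", "schema_name": "sales",
     "table_name": "returns", "status": "Unhealthy"},
]
print(unhealthy_tables(rows))
```

Treating every value other than the healthy one as worth a look is a deliberate choice here, so that unknown or missing statuses are not silently skipped.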
commit_freshness structure
The commit_freshness struct contains the following:
| Item name | Data type | Description | Example data |
|---|---|---|---|
|  | string | Status of commit freshness check. |  |
|  | string | Error message encountered during check. |  |
|  | timestamp | Last commit timestamp. |  |
|  | timestamp | Predicted time by which the table should have been updated. |  |
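
One way to read the two timestamp fields together is to measure how far the current time has passed the predicted update time. The sketch below shows that calculation only; it is not the method the product uses to set the check status, and the variable names and timestamp values are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def overdue_by(predicted_update_by, now):
    """Return how far `now` is past the predicted update time (zero if not past it)."""
    return max(now - predicted_update_by, timedelta(0))


# Made-up values standing in for the two timestamp fields of the struct.
last_commit = datetime(2025, 1, 1, 6, 0, tzinfo=timezone.utc)
predicted_update_by = datetime(2025, 1, 2, 6, 0, tzinfo=timezone.utc)
now = datetime(2025, 1, 2, 9, 30, tzinfo=timezone.utc)

print("time since last commit:", now - last_commit)
print("overdue by:", overdue_by(predicted_update_by, now))
```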
total_row_count and daily_row_count structure
The total_row_count and daily_row_count structs contain the following:
| Item name | Data type | Description | Example data |
|---|---|---|---|
|  | string | Status of the check. |  |
|  | string | Error message encountered during check. |  |
|  | int | Number of rows observed in the last 24 hours. |  |
|  | int | Minimum expected number of rows in the last 24 hours. |  |
|  | int | Maximum expected number of rows in the last 24 hours. |  |
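
The observed, minimum expected, and maximum expected values can be compared directly to see on which side of the expected range a table falls. The following is a minimal sketch of that comparison; the function name and the returned labels are hypothetical and are not the status values written to the results table.

```python
def classify_row_count(observed, min_expected, max_expected):
    """Compare an observed row count with its expected range (bounds inclusive)."""
    if observed < min_expected:
        return "below_expected"
    if observed > max_expected:
        return "above_expected"
    return "within_expected"


# Made-up numbers for illustration.
print(classify_row_count(observed=120, min_expected=500, max_expected=900))
print(classify_row_count(observed=700, min_expected=500, max_expected=900))
```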
upstream_jobs array structure
Each element of the upstream_jobs array has the following structure:
| Item name | Data type | Description | Example data |
|---|---|---|---|
|  | string | Job ID. |  |
|  | string | Workspace ID. |  |
|  | string | Job display name. |  |
|  | string | Status of the most recent run. |  |
|  | string | URL of Databricks job run page. |  |
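
When investigating an unhealthy table, it can help to list only the upstream jobs whose most recent run did not succeed, together with their run page URLs. The sketch below assumes each array element has been read into a dictionary; the keys `job_name`, `last_run_status`, and `run_page_url`, and the status value `"SUCCESS"`, are hypothetical names chosen for this example.

```python
# Hypothetical field names and status value; verify against the real array elements.
SUCCESS = "SUCCESS"


def failing_upstream_jobs(upstream_jobs):
    """Return (job name, run page URL) pairs for jobs whose latest run did not succeed."""
    return [
        (job["job_name"], job["run_page_url"])
        for job in upstream_jobs
        if job["last_run_status"] != SUCCESS
    ]


# Made-up entries; the URLs are placeholders, not real job run pages.
upstream_jobs = [
    {"job_id": "101", "workspace_id": "1", "job_name": "ingest_orders",
     "last_run_status": "SUCCESS", "run_page_url": "https://example.com/jobs/101"},
    {"job_id": "102", "workspace_id": "1", "job_name": "ingest_returns",
     "last_run_status": "FAILED", "run_page_url": "https://example.com/jobs/102"},
]
for name, url in failing_upstream_jobs(upstream_jobs):
    print(f"{name}: {url}")
```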
Downstream impact information
In the logged results table, the column downstream_impact is a struct with the following fields:
| Field | Type | Description |
|---|---|---|
|  | int | Integer value between 1 and 4 indicating the severity of the data quality issue. Higher values indicate greater disruption. |
|  | int | Number of downstream tables that might be affected by the identified issue. |
|  | int | Total number of queries that have referenced the affected and downstream tables in the past 30 days. |
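
Because higher severity values indicate greater disruption, one possible triage order is to sort results by severity first and by the number of affected downstream tables second. The sketch below assumes rows have been read into dictionaries; `downstream_impact` is the column described above, while the nested keys `impact_level` and `num_downstream_tables` and the table names are hypothetical.

```python
def rank_by_impact(rows):
    """Sort rows so the highest severity and widest downstream reach come first."""
    return sorted(
        rows,
        key=lambda row: (
            row["downstream_impact"]["impact_level"],           # hypothetical key
            row["downstream_impact"]["num_downstream_tables"],  # hypothetical key
        ),
        reverse=True,
    )


# Made-up rows for illustration.
rows = [
    {"table_name": "orders",
     "downstream_impact": {"impact_level": 2, "num_downstream_tables": 5}},
    {"table_name": "returns",
     "downstream_impact": {"impact_level": 4, "num_downstream_tables": 1}},
    {"table_name": "customers",
     "downstream_impact": {"impact_level": 2, "num_downstream_tables": 9}},
]
print([row["table_name"] for row in rank_by_impact(rows)])
```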