Skip to main content

Data quality monitoring results system table reference

Beta

This feature is in Beta.

This page outlines the data quality monitoring results system table schema and includes sample queries. The table stores results of freshness and completeness checks, as well as downstream impact and root cause analysis, across all tables enabled for data quality monitoring in your metastore.

Table path: system.data_quality_monitoring.table_results

Only account admins can access this table, and they must grant access to others as needed. The system table uses default storage. Because it contains sample values and downstream usage data, use caution when granting access to others.

Data quality monitoring results table schema

The system.data_quality_monitoring.table_results table uses the following schema:

Column name

Contents (for struct data type)

Data type

Description

Example data

event_time

timestamp

Time when the row was generated.

2025-06-27T12:00:00

catalog_name

string

Name of the catalog. Used to identify the table.

main

schema_name

string

Name of the schema. Used to identify the table.

default

table_name

string

Name of the table. Used to identify the table.

events

catalog_id

string

Stable ID for the catalog.

3f1a7d6e-9c59-4b76-8c32-8d4c74e289fe

schema_id

string

Stable ID for the schema.

3f1a7d6e-9c59-4b76-8c32-8d4c74e289fe

table_id

string

Stable ID for the table.

3f1a7d6e-9c59-4b76-8c32-8d4c74e289fe

status

string

Consolidated health status at the table level. "Unhealthy" if any check or group is unhealthy.

Healthy, Unhealthy, Unknown

freshness

struct

Freshness checks.

status

string

Overall freshness status.

Unhealthy

commit_freshness

struct

Commit freshness check results.

completeness

struct

Completeness check results.

status

string

Status of completeness check.

Unhealthy

total_row_count

struct

Total number of rows in the table over time.

daily_row_count

struct

Number of rows added each day.

downstream_impact

struct

Summary of downstream impact based on dependency graph.

impact_level

int

Severity indicator (0 = none, 1 = low, 2 = medium, 3 = high, 4 = very high).

2

num_downstream_tables

int

Number of downstream tables affected.

5

num_queries_on_affected_tables

int

Number of queries run on affected downstream tables over the last 30 days.

120

root_cause_analysis

struct

Information about upstream jobs contributing to the issue.

upstream_jobs

array

Metadata for each upstream job.

commit_freshness array structure

The commit_freshness struct contains the following:

Item name

Data type

Description

Example data

status

string

Status of commit freshness check.

Unhealthy

error_code

string

Error message encountered during check.

FAILED_TO_FIT_MODEL

last_value

timestamp

Last commit timestamp.

2025-06-27T11:30:00

predicted_value

timestamp

Predicted time by which the table should have been updated.

2025-06-27T11:45:00

total_row_count and daily_row_count array structure

The total_row_count and daily_row_count structs contain the following:

Item name

Data type

Description

Example data

status

string

Status of the check.

Unhealthy

error_code

string

Error message encountered during check.

FAILED_TO_FIT_MODEL

last_value

int

Number of rows observed in the last 24 hours.

500

min_predicted_value

int

Minimum expected number of rows in the last 24 hours.

10

max_predicted_value

int

Maximum expected number of rows in the last 24 hours.

1000

upstream_jobs array structure

The structure of the array shown in the upstream_jobs column is shown in the following table:

Item name

Data type

Description

Example data

job_id

string

Job ID.

12345

workspace_id

string

Workspace ID.

6051921418418893

job_name

string

Job display name.

daily_refresh

last_run_status

string

Status of the most recent run.

SUCCESS

run_page_url

string

URL of Databricks job run page.

https://.../runs/123

Downstream impact information

In the logged results table, the column downstream_impact is a struct with the following fields:

Field

Type

Description

impact_level

int

Integer value between 1 and 4 indicating the severity of the data quality issue. Higher values indicate greater disruption.

num_downstream_tables

int

Number of downstream tables that might be affected by the identified issue.

num_queries_on_affected_tables

int

Total number of queries that have referenced the affected and downstream tables in the past 30 days.

Example queries

Replace parameter values with your own before running.

Get all current incidents in a schema

SQL
WITH latest_rows AS (
SELECT
*,
ROW_NUMBER() OVER (
PARTITION BY table_id
ORDER BY event_time DESC
) AS rn
FROM
system.data_quality_monitoring.table_results
WHERE
catalog_name = "c"
AND schema_name = "s"
)

SELECT *
FROM latest_rows
WHERE
rn = 1
AND status = "Unhealthy"

Get all incident tables in a schema that have a high downstream impact

SQL
WITH latest_rows AS (
SELECT
*,
ROW_NUMBER() OVER (
PARTITION BY table_id
ORDER BY event_time DESC
) AS rn
FROM
system.data_quality_monitoring.table_results
WHERE
catalog_name = "c"
AND schema_name = "s"
)
SELECT *
FROM latest_rows
WHERE rn = 1
AND downstream_impact.impact_level >= 3

Get all tables in a schema that are currently impacted by a freshness issue

SQL
WITH latest_rows AS (
SELECT
*,
ROW_NUMBER() OVER (
PARTITION BY table_id
ORDER BY event_time DESC
) AS rn
FROM
system.data_quality_monitoring.table_results
WHERE
catalog_name = "c"
AND schema_name = "s"
)

SELECT *
FROM latest_rows
WHERE rn = 1
AND freshness.status = "Unhealthy"

Get all historical records for a table

SQL
SELECT *
FROM system.data_quality_monitoring.table_results
WHERE
catalog_name = "c"
AND schema_name = "s"
AND table_name = "t"