Skip to main content

Data classification system table reference

Beta

This feature is in Beta.

This page outlines the data classification results table schema and includes sample queries. The table stores detections for sensitive data classes at the column level across enabled catalogs in your metastore.

Table path: system.data_classification.results

Data classification results table schema

The data classification results system table uses the following schema:

Column name

Data type

Description

Example

latest_detected_time

timestamp

As-of time when the column was most recently scanned.

2025-06-27T12:34

first_detected_time

timestamp

Time when the column detection was first recorded.

2025-06-27T12:34

catalog_id

string

ID of the catalog.

3f1a7d6e-9c59-...

table_id

string

ID of the table.

3f1a7d6e-9c59-...

catalog_name

string

Catalog name.

main_catalog

schema_name

string

Schema name.

public

table_name

string

Table name.

sales_data

column_name

string

Column name.

customer_email

data_type

string

Data type of the column. Complex types include full struct definitions.

struct<name:string, age:int>

class_tag

string

Tag for the detected entity or tag key and optional value.

class.us_ssn or pii: confidential

samples

array<string>

Up to five sample values that matched the detection.

["a@b.com", ...]

confidence

string

Confidence of detection. Either HIGH or LOW

HIGH

frequency

float

Estimation of the proportion of matching rows in the sample. Between 0 and 1.

0.87

Example queries

Replace parameter values with your own before running.

Get all detections for a table

SQL
SELECT *
FROM system.data_classification.results
WHERE
catalog_name = "c"
AND schema_name = "s"
AND table_name = "t";

Get all high confidence detections

SQL
SELECT *
FROM system.data_classification.results
WHERE
catalog_name = "c"
AND schema_name = "s"
AND table_name = "t"
AND confidence = "HIGH";

Get number of tables impacted by a specific classification

SQL
SELECT
class_tag,
COUNT(DISTINCT catalog_name, schema_name, table_name) AS num_tables
FROM
system.data_classification.results
WHERE
class_tag IS NOT NULL
GROUP BY class_tag;

Get number of users who queried tables with sensitive data in last 30 days

SQL
WITH table_accesses AS (
SELECT
IFNULL(
request_params.full_name_arg,
CONCAT(request_params.catalog_name, '.', request_params.schema_name, '.', request_params.name)
) AS table_full_name,
COUNT(DISTINCT user_identity.email) AS num_users
FROM
system.access.audit
WHERE
action_name IN ("createTable", "getTable", "updateTable", "deleteTable")
AND (
-- For performance, limit the blast radius of the audit log query to only the current catalog
request_params.catalog_name = :catalog_name OR
request_params.full_name_arg LIKE :catalog_name || '%'
)
AND event_time >= DATE_SUB(current_date(), 30)
GROUP BY table_full_name
),
sensitive_tables AS (
SELECT
DISTINCT CONCAT(catalog_name, '.', schema_name, '.', table_name) AS table_full_name
FROM
system.data_classification.results
WHERE class_tag IS NOT NULL
)

SELECT
st.table_full_name,
ta.num_users
FROM
sensitive_tables st
JOIN table_accesses ta
ON st.table_full_name = ta.table_full_name