Information schema (Databricks SQL)

Preview

Unity Catalog is in Public Preview. To participate in the preview, contact your Databricks representative.

The INFORMATION_SCHEMA is a SQL Standard based, system provided schema present in every catalog other than the HIVE_METASTORE catalog.

Within the information schema, you can find a set of views describing the objects known to the schema’s catalog that you are privileged the see. The information schema of the SYSTEM catalog returns information about objects across all catalogs within the metastore.

The purpose of the information schema is to provide a SQL based, self describing API to the metadata.

Entity relationship diagram of the information schema

The following entity relationship (ER) diagram provides an overview of the relations within the information schema and how they relate to each other.

ER diagram of information schema

Information schema views

Name

Description

CATALOG_PRIVILEGES

Lists principals which have privileges on the catalog.

CATALOGS

Describes catalogs.

COLUMNS

Describes columns of tables and views in the catalog.

CHECK_CONTRAINTS

Reserved for future use.

INFORMATION_SCHEMA_CATALOG_NAME

Returns the name of this information schema’s catalog.

REFERENTIAL_CONSTRAINTS

Reserved for future use.

TABLE_PRIVILEGES

Lists principals which have privileges on the tables and views in the catalog.

TABLES

Describes tables and views defined within the catalog.

SCHEMA_PRIVILEGES

Lists principals which have privileges on the schemas in the catalog.

SCHEMATA

Describes schemas within the catalog.

VIEWS

Describes view specific information about the views in the catalog.

Notes

While identifiers are case-insensitive when referenced in SQL statements, they are stored in the information schema as STRING. This implies that you must either search for them using the case in which the identifier is stored, or use functions such as ilike.

Examples

> SELECT table_name, column_name
    FROM information_schema.views
    WHERE data_type = 'DOUBLE'
      AND schema_name = 'information_schema';

The following are examples of workflows that use the system level information schema tables.

If you want to view all tables that have been created in the last 24 hours, your query could look like the following.

> SELECT table_name, table_owner, created_by, last_altered, last_altered_by, table_catalog
    FROM system.information_schema.tables
    WHERE  datediff(now(), last_altered) < 1;

If you want to view how many tables you have in each schema, consider the following example.

> SELECT table_schema, count(table_name)
    FROM system.information_schema.tables
    WHERE table_schema = 'tphc'
    GROUP BY table_schema
    ORDER BY 2 DESC