data-quality command group

Note

This information applies to Databricks CLI versions 0.205 and above. The Databricks CLI is in Public Preview.

Databricks CLI use is subject to the Databricks License and Databricks Privacy Notice, including any Usage Data provisions.

The data-quality command group within the Databricks CLI contains commands to manage the data quality of Unity Catalog objects.

databricks data-quality cancel-refresh

Cancel a data quality monitor refresh. This command is currently supported only for the table object_type. The call must be made in the same workspace where the monitor was created.

The caller must have one of the following sets of permissions:

  1. MANAGE and USE_CATALOG on the table's parent catalog.
  2. USE_CATALOG on the table's parent catalog, and MANAGE and USE_SCHEMA on the table's parent schema.
  3. USE_CATALOG on the table's parent catalog, USE_SCHEMA on the table's parent schema, and MANAGE on the table.

databricks data-quality cancel-refresh OBJECT_TYPE OBJECT_ID REFRESH_ID [flags]

Arguments

OBJECT_TYPE

    The type of the monitored object. Can be one of the following: schema or table.

OBJECT_ID

    The UUID of the request object. It is schema_id for schema, and table_id for table.

    Find the schema_id from either: (1) The schema_id of the Schemas resource. (2) In Catalog Explorer → select the schema → go to the Details tab → the Schema ID field.

    Find the table_id from either: (1) The table_id of the Tables resource. (2) In Catalog Explorer → select the table → go to the Details tab → the Table ID field.

REFRESH_ID

    The unique ID of the refresh operation.

Options

Global flags

Examples

The following example cancels a refresh operation:

Bash
databricks data-quality cancel-refresh table a1b2c3d4-e5f6-7890-a1b2-c3d4e5f67890 refresh-12345
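
The argument rules above (only the table object_type is supported, and OBJECT_ID is a UUID) can be checked locally before the CLI is called. The following is a minimal sketch, not part of the Databricks CLI: is_uuid and cancel_refresh are hypothetical helper names, and the UUID check only tests the 8-4-4-4-12 hexadecimal shape.

```bash
# Hypothetical helpers; neither function is part of the Databricks CLI.

# Succeeds if $1 looks like a UUID (8-4-4-4-12 hexadecimal digits).
is_uuid() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# Validates the arguments, then cancels the refresh.
cancel_refresh() {
  local object_type="$1" object_id="$2" refresh_id="$3"
  if [ "$object_type" != "table" ]; then
    echo "cancel-refresh currently supports only the table object_type" >&2
    return 2
  fi
  if ! is_uuid "$object_id"; then
    echo "OBJECT_ID must be a UUID, got: $object_id" >&2
    return 2
  fi
  databricks data-quality cancel-refresh "$object_type" "$object_id" "$refresh_id"
}

# Usage:
#   cancel_refresh table a1b2c3d4-e5f6-7890-a1b2-c3d4e5f67890 refresh-12345
```

Failing early with a local message avoids a round trip to the workspace for arguments that cannot succeed.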

databricks data-quality create-monitor

Create a data quality monitor on a Unity Catalog object. The caller must provide either anomaly_detection_config for a schema monitor or data_profiling_config for a table monitor.

For the table object_type, the caller must have one of the following sets of permissions:

  1. MANAGE and USE_CATALOG on the table's parent catalog, USE_SCHEMA on the table's parent schema, and SELECT on the table.
  2. USE_CATALOG on the table's parent catalog, MANAGE and USE_SCHEMA on the table's parent schema, and SELECT on the table.
  3. USE_CATALOG on the table's parent catalog, USE_SCHEMA on the table's parent schema, and MANAGE and SELECT on the table.

Workspace assets, such as the dashboard, will be created in the workspace where this call was made.

For the schema object_type, the caller must have either of the following sets of permissions:

  1. MANAGE and USE_CATALOG on the schema's parent catalog.
  2. USE_CATALOG on the schema's parent catalog, and MANAGE and USE_SCHEMA on the schema.

databricks data-quality create-monitor OBJECT_TYPE OBJECT_ID [flags]

Arguments

OBJECT_TYPE

    The type of the monitored object. Can be one of the following: schema or table.

OBJECT_ID

    The UUID of the request object. It is schema_id for schema, and table_id for table.

    Find the schema_id from either: (1) The schema_id of the Schemas resource. (2) In Catalog Explorer → select the schema → go to the Details tab → the Schema ID field.

    Find the table_id from either: (1) The table_id of the Tables resource. (2) In Catalog Explorer → select the table → go to the Details tab → the Table ID field.

Options

--json JSON

    The inline JSON string or the @path to the JSON file with the request body

Global flags

Examples

The following example creates a data quality monitor for a table:

Bash
databricks data-quality create-monitor table a1b2c3d4-e5f6-7890-a1b2-c3d4e5f67890 --json '{"data_profiling_config": {"enabled": true}}'

The following example creates a monitor using a JSON file:

Bash
databricks data-quality create-monitor table a1b2c3d4-e5f6-7890-a1b2-c3d4e5f67890 --json @monitor-config.json
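
Because a table monitor takes data_profiling_config and a schema monitor takes anomaly_detection_config, a small wrapper can pick the request-body key from the object type. The following is a sketch under stated assumptions: monitor_config_key and create_monitor are hypothetical helper names, and the contents of the config object are passed through unchanged because this page does not list the fields each config accepts.

```bash
# Hypothetical helpers; neither function is part of the Databricks CLI.

# Prints the top-level request-body key that matches the object type.
monitor_config_key() {
  case "$1" in
    table)  echo "data_profiling_config" ;;
    schema) echo "anomaly_detection_config" ;;
    *)      echo "unknown object type: $1" >&2; return 2 ;;
  esac
}

# Wraps CONFIG_JSON under the matching key and creates the monitor.
# CONFIG_JSON is passed through unchanged; its fields are up to the caller.
create_monitor() {
  local object_type="$1" object_id="$2" config_json="$3" key
  key="$(monitor_config_key "$object_type")" || return 2
  databricks data-quality create-monitor "$object_type" "$object_id" \
    --json "{\"$key\": $config_json}"
}

# Usage (same request as the inline JSON example above):
#   create_monitor table a1b2c3d4-e5f6-7890-a1b2-c3d4e5f67890 '{"enabled": true}'
```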

databricks data-quality create-refresh

Create a refresh. The call must be made in the same workspace where the monitor was created.

The caller must have one of the following sets of permissions:

  1. MANAGE and USE_CATALOG on the table's parent catalog.
  2. USE_CATALOG on the table's parent catalog, and MANAGE and USE_SCHEMA on the table's parent schema.
  3. USE_CATALOG on the table's parent catalog, USE_SCHEMA on the table's parent schema, and MANAGE on the table.

databricks data-quality create-refresh OBJECT_TYPE OBJECT_ID [flags]

Arguments

OBJECT_TYPE

    The type of the monitored object. Can be one of the following: schema or table.

OBJECT_ID

    The UUID of the request object. It is schema_id for schema, and table_id for table.

    Find the schema_id from either: (1) The schema_id of the Schemas resource. (2) In Catalog Explorer → select the schema → go to the Details tab → the Schema ID field.

    Find the table_id from either: (1) The table_id of the Tables resource. (2) In Catalog Explorer → select the table → go to the Details tab → the Table ID field.

Options

--json JSON

    The inline JSON string or the @path to the JSON file with the request body

Global flags

Examples

The following example creates a refresh for a table monitor:

Bash
databricks data-quality create-refresh table a1b2c3d4-e5f6-7890-a1b2-c3d4e5f67890

The following example creates a refresh using JSON:

Bash
databricks data-quality create-refresh table a1b2c3d4-e5f6-7890-a1b2-c3d4e5f67890 --json '{}'

databricks data-quality delete-monitor

Delete a data quality monitor on a Unity Catalog object.

For the table object_type, the caller must have one of the following sets of permissions:

  1. MANAGE and USE_CATALOG on the table's parent catalog.
  2. USE_CATALOG on the table's parent catalog, and MANAGE and USE_SCHEMA on the table's parent schema.
  3. USE_CATALOG on the table's parent catalog, USE_SCHEMA on the table's parent schema, and MANAGE on the table.

Important

This call does not delete the metric tables or the dashboard. Clean up those assets manually if you no longer need them.

For the schema object_type, the caller must have either of the following sets of permissions:

  1. MANAGE and USE_CATALOG on the schema's parent catalog.
  2. USE_CATALOG on the schema's parent catalog, and MANAGE and USE_SCHEMA on the schema.

databricks data-quality delete-monitor OBJECT_TYPE OBJECT_ID [flags]

Arguments

OBJECT_TYPE

    The type of the monitored object. Can be one of the following: schema or table.

OBJECT_ID

    The UUID of the request object. It is schema_id for schema, and table_id for table.

    Find the schema_id from either: (1) The schema_id of the Schemas resource. (2) In Catalog Explorer → select the schema → go to the Details tab → the Schema ID field.

    Find the table_id from either: (1) The table_id of the Tables resource. (2) In Catalog Explorer → select the table → go to the Details tab → the Table ID field.

Options

Global flags

Examples

The following example deletes a data quality monitor:

Bash
databricks data-quality delete-monitor table a1b2c3d4-e5f6-7890-a1b2-c3d4e5f67890

databricks data-quality get-monitor

Read a data quality monitor on a Unity Catalog object.

For the table object_type, the caller must have one of the following sets of permissions:

  1. MANAGE and USE_CATALOG on the table's parent catalog.
  2. USE_CATALOG on the table's parent catalog, and MANAGE and USE_SCHEMA on the table's parent schema.
  3. USE_CATALOG on the table's parent catalog, USE_SCHEMA on the table's parent schema, and SELECT on the table.

For the schema object_type, the caller must have either of the following sets of permissions:

  1. MANAGE and USE_CATALOG on the schema's parent catalog.
  2. USE_CATALOG on the schema's parent catalog, and USE_SCHEMA on the schema.

The returned information includes configuration values on the entity and parent entity, as well as information on assets created by the monitor. Some information (for example, the dashboard) may be filtered out if the caller is in a different workspace from the one where the monitor was created.

databricks data-quality get-monitor OBJECT_TYPE OBJECT_ID [flags]

Arguments

OBJECT_TYPE

    The type of the monitored object. Can be one of the following: schema or table.

OBJECT_ID

    The UUID of the request object. It is schema_id for schema, and table_id for table.

    Find the schema_id from either: (1) The schema_id of the Schemas resource. (2) In Catalog Explorer → select the schema → go to the Details tab → the Schema ID field.

    Find the table_id from either: (1) The table_id of the Tables resource. (2) In Catalog Explorer → select the table → go to the Details tab → the Table ID field.

Options

Global flags

Examples

The following example gets information about a data quality monitor:

Bash
databricks data-quality get-monitor table a1b2c3d4-e5f6-7890-a1b2-c3d4e5f67890

databricks data-quality get-refresh

Get data quality monitor refresh information. The call must be made in the same workspace where the monitor was created.

For the table object_type, the caller must have one of the following sets of permissions:

  1. MANAGE and USE_CATALOG on the table's parent catalog.
  2. USE_CATALOG on the table's parent catalog, and MANAGE and USE_SCHEMA on the table's parent schema.
  3. USE_CATALOG on the table's parent catalog, USE_SCHEMA on the table's parent schema, and SELECT on the table.

For the schema object_type, the caller must have either of the following sets of permissions:

  1. MANAGE and USE_CATALOG on the schema's parent catalog.
  2. USE_CATALOG on the schema's parent catalog, and USE_SCHEMA on the schema.

databricks data-quality get-refresh OBJECT_TYPE OBJECT_ID REFRESH_ID [flags]

Arguments

OBJECT_TYPE

    The type of the monitored object. Can be one of the following: schema or table.

OBJECT_ID

    The UUID of the request object. It is schema_id for schema, and table_id for table.

    Find the schema_id from either: (1) The schema_id of the Schemas resource. (2) In Catalog Explorer → select the schema → go to the Details tab → the Schema ID field.

    Find the table_id from either: (1) The table_id of the Tables resource. (2) In Catalog Explorer → select the table → go to the Details tab → the Table ID field.

REFRESH_ID

    The unique ID of the refresh operation.

Options

Global flags

Examples

The following example gets information about a refresh:

Bash
databricks data-quality get-refresh table a1b2c3d4-e5f6-7890-a1b2-c3d4e5f67890 refresh-12345
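
A common use of get-refresh is to poll until a refresh stops running. The following is a rough sketch only. It assumes that the JSON output contains a string field named state, and that its value ends in PENDING or RUNNING while the refresh is in progress; neither the field name nor the values are described on this page, so check the actual output before relying on it. json_string_field and wait_for_refresh are hypothetical helper names.

```bash
# Hypothetical helpers; neither function is part of the Databricks CLI.

# Reads JSON on stdin and prints the first string value found for field $1.
# Rough extraction without jq: assumes the value contains no escaped quotes.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Polls get-refresh until the state is no longer pending or running.
# Arguments: OBJECT_TYPE OBJECT_ID REFRESH_ID [INTERVAL_SECONDS] [MAX_POLLS]
wait_for_refresh() {
  local object_type="$1" object_id="$2" refresh_id="$3"
  local interval="${4:-30}" max_polls="${5:-20}" polls=0 state
  while [ "$polls" -lt "$max_polls" ]; do
    state="$(databricks data-quality get-refresh \
      "$object_type" "$object_id" "$refresh_id" --output json |
      json_string_field state)"
    case "$state" in
      "") echo "no state field found in the output" >&2; return 2 ;;
      *PENDING|*RUNNING) sleep "$interval" ;;   # assumed in-progress states
      *) echo "$state"; return 0 ;;             # anything else is treated as final
    esac
    polls=$((polls + 1))
  done
  echo "timed out waiting for refresh $refresh_id" >&2
  return 1
}

# Usage:
#   wait_for_refresh table a1b2c3d4-e5f6-7890-a1b2-c3d4e5f67890 refresh-12345 30 20
```

The poll limit keeps a stuck refresh from blocking a script indefinitely.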

databricks data-quality list-refresh

List data quality monitor refreshes. The call must be made in the same workspace where the monitor was created.

For the table object_type, the caller must have one of the following sets of permissions:

  1. MANAGE and USE_CATALOG on the table's parent catalog.
  2. USE_CATALOG on the table's parent catalog, and MANAGE and USE_SCHEMA on the table's parent schema.
  3. USE_CATALOG on the table's parent catalog, USE_SCHEMA on the table's parent schema, and SELECT on the table.

For the schema object_type, the caller must have either of the following sets of permissions:

  1. MANAGE and USE_CATALOG on the schema's parent catalog.
  2. USE_CATALOG on the schema's parent catalog, and USE_SCHEMA on the schema.

databricks data-quality list-refresh OBJECT_TYPE OBJECT_ID [flags]

Arguments

OBJECT_TYPE

    The type of the monitored object. Can be one of the following: schema or table.

OBJECT_ID

    The UUID of the request object. It is schema_id for schema, and table_id for table.

    Find the schema_id from either: (1) The schema_id of the Schemas resource. (2) In Catalog Explorer → select the schema → go to the Details tab → the Schema ID field.

    Find the table_id from either: (1) The table_id of the Tables resource. (2) In Catalog Explorer → select the table → go to the Details tab → the Table ID field.

Options

--page-size int

    Maximum number of refreshes to return per page.

--page-token string

    Token to retrieve the next page of results.

Global flags

Examples

The following example lists all refreshes for a monitor:

Bash
databricks data-quality list-refresh table a1b2c3d4-e5f6-7890-a1b2-c3d4e5f67890

The following example lists refreshes with pagination:

Bash
databricks data-quality list-refresh table a1b2c3d4-e5f6-7890-a1b2-c3d4e5f67890 --page-size 10
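
The two paging flags can be combined. The following sketch adds each flag only when a value is supplied; list_refreshes is a hypothetical helper name, and the page token is assumed to come from an earlier list-refresh response, whose exact output shape is not described on this page.

```bash
# Hypothetical helper; not part of the Databricks CLI.
# Arguments: OBJECT_TYPE OBJECT_ID [PAGE_SIZE] [PAGE_TOKEN]
list_refreshes() {
  local object_type="$1" object_id="$2" page_size="$3" page_token="$4"
  # Rebuild the positional parameters as the CLI argument list.
  set -- data-quality list-refresh "$object_type" "$object_id"
  if [ -n "$page_size" ]; then set -- "$@" --page-size "$page_size"; fi
  if [ -n "$page_token" ]; then set -- "$@" --page-token "$page_token"; fi
  databricks "$@"
}

# Usage:
#   list_refreshes table a1b2c3d4-e5f6-7890-a1b2-c3d4e5f67890 10
#   list_refreshes table a1b2c3d4-e5f6-7890-a1b2-c3d4e5f67890 10 "$token"
```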

databricks data-quality update-monitor

Update a data quality monitor on a Unity Catalog object.

For the table object_type, the caller must have one of the following sets of permissions:

  1. MANAGE and USE_CATALOG on the table's parent catalog.
  2. USE_CATALOG on the table's parent catalog, and MANAGE and USE_SCHEMA on the table's parent schema.
  3. USE_CATALOG on the table's parent catalog, USE_SCHEMA on the table's parent schema, and MANAGE on the table.

For the schema object_type, the caller must have either of the following sets of permissions:

  1. MANAGE and USE_CATALOG on the schema's parent catalog.
  2. USE_CATALOG on the schema's parent catalog, and MANAGE and USE_SCHEMA on the schema.

databricks data-quality update-monitor OBJECT_TYPE OBJECT_ID UPDATE_MASK OBJECT_TYPE OBJECT_ID [flags]

Arguments

OBJECT_TYPE

    The type of the monitored object. Can be one of the following: schema or table.

OBJECT_ID

    The UUID of the request object. It is schema_id for schema, and table_id for table.

UPDATE_MASK

    The field mask to specify which fields to update as a comma-separated list. Example value: data_profiling_config.custom_metrics,data_profiling_config.schedule.quartz_cron_expression.

Options

--json JSON

    The inline JSON string or the @path to the JSON file with the request body

Global flags

Examples

The following example updates a monitor's configuration:

Bash
databricks data-quality update-monitor table a1b2c3d4-e5f6-7890-a1b2-c3d4e5f67890 "data_profiling_config.schedule.quartz_cron_expression" table a1b2c3d4-e5f6-7890-a1b2-c3d4e5f67890 --json '{"data_profiling_config": {"schedule": {"quartz_cron_expression": "0 0 12 * * ?"}}}'
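
When several fields change at once, UPDATE_MASK is a comma-separated list, and the command syntax above repeats OBJECT_TYPE and OBJECT_ID after the mask. The following sketch builds the mask from separate field paths and fills in the repeated arguments; join_mask and update_monitor are hypothetical helper names, not CLI commands.

```bash
# Hypothetical helpers; neither function is part of the Databricks CLI.

# Joins its arguments with commas: join_mask a.b c.d prints a.b,c.d
join_mask() {
  local IFS=','
  printf '%s\n' "$*"
}

# Arguments: OBJECT_TYPE OBJECT_ID JSON_BODY FIELD_PATH [FIELD_PATH...]
update_monitor() {
  if [ "$#" -lt 4 ]; then
    echo "usage: update_monitor OBJECT_TYPE OBJECT_ID JSON_BODY FIELD_PATH..." >&2
    return 2
  fi
  local object_type="$1" object_id="$2" body="$3" mask
  shift 3
  mask="$(join_mask "$@")"
  # OBJECT_TYPE and OBJECT_ID appear twice, following the syntax line above.
  databricks data-quality update-monitor \
    "$object_type" "$object_id" "$mask" "$object_type" "$object_id" \
    --json "$body"
}

# Usage:
#   update_monitor table a1b2c3d4-e5f6-7890-a1b2-c3d4e5f67890 @monitor-update.json \
#     data_profiling_config.custom_metrics \
#     data_profiling_config.schedule.quartz_cron_expression
```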

Global flags

--debug

    Whether to enable debug logging.

-h, --help

    Display help for the Databricks CLI or the related command group or the related command.

--log-file string

    The file to write output logs to. If this flag is not specified, output logs are written to stderr.

--log-format format

    The log format type, text or json. The default value is text.

--log-level string

    A string representing the log level. If this flag is not specified, logging is disabled.

-o, --output type

    The command output type, text or json. The default value is text.

-p, --profile string

    The name of the profile in the ~/.databrickscfg file to use to run the command. If this flag is not specified, the profile named DEFAULT is used, if it exists.

--progress-format format

    The format to display progress logs: default, append, inplace, or json.

-t, --target string

    If applicable, the bundle target to use.
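
Global flags can be appended to any command in this group. The following sketch wraps the group so that every call uses a chosen profile and JSON output; dq_json is a hypothetical helper name, and DQ_PROFILE is a hypothetical variable used only by this sketch, not a setting that the CLI reads.

```bash
# Hypothetical helper; not part of the Databricks CLI.
# Runs any data-quality subcommand with a fixed profile and JSON output.
# DQ_PROFILE is a variable local to this sketch; it falls back to DEFAULT.
dq_json() {
  databricks data-quality "$@" \
    --profile "${DQ_PROFILE:-DEFAULT}" \
    --output json
}

# Usage:
#   DQ_PROFILE=staging dq_json get-monitor table a1b2c3d4-e5f6-7890-a1b2-c3d4e5f67890
```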