REORG TABLE

Applies to: Databricks SQL, Databricks Runtime 11.3 LTS and above

Reorganize a Delta Lake table by rewriting files to purge soft-deleted data, such as the column data dropped by ALTER TABLE DROP COLUMN.

Syntax

REORG [ TABLE ] table_name { [ WHERE predicate ] APPLY ( PURGE ) |
                             APPLY ( UPGRADE UNIFORM ( ICEBERG_COMPAT_VERSION = version ) ) }

In Databricks Runtime versions before 15.4, TABLE is a mandatory keyword.

Note

  • APPLY (PURGE) only rewrites files that contain soft-deleted data.

  • APPLY (UPGRADE) may rewrite all files.

  • REORG TABLE is idempotent, meaning that if it is run twice on the same dataset, the second run has no effect.

  • After running APPLY (PURGE), the soft-deleted data may still exist in the old files. You can run VACUUM to physically delete the old files.
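
The notes above can be combined into a typical clean-up sequence. The following is a minimal sketch, not a definitive procedure. It assumes a Delta table named events (the name used in the examples in this article) with column mapping enabled, and a hypothetical column legacy_payload that exists only for illustration.

```sql
-- Assumption: events is a Delta table with column mapping enabled, and
-- legacy_payload is a hypothetical column name used for illustration.

-- 1. Drop the column. This is a metadata-only change; the column data
--    remains in the existing data files as soft-deleted data.
ALTER TABLE events DROP COLUMN legacy_payload;

-- 2. Rewrite only the files that still contain the soft-deleted data.
REORG TABLE events APPLY (PURGE);

-- 3. Remove old files that the table no longer references and that are
--    older than the retention threshold.
VACUUM events;
```

VACUUM deletes only files that are older than the table's retention threshold, so the old files can remain on storage for some time after the REORG run completes.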

Parameters

  • table_name

    Identifies an existing Delta table. The name must not include a temporal specification or options specification.

  • WHERE predicate

    For APPLY (PURGE), reorganizes the files that match the given partition predicate. Only filters involving partition key attributes are supported.

  • APPLY (PURGE)

    Specifies that the purpose of file rewriting is to purge soft-deleted data. See Purge metadata-only deletes to force data rewrite.

  • APPLY (UPGRADE UNIFORM ( ICEBERG_COMPAT_VERSION = version ))

    Applies to: Databricks SQL, Databricks Runtime 14.3 and above

    Specifies that the purpose of file rewriting is to upgrade the table to the given Iceberg compatibility version. The value of version must be either 1 or 2.
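
The partition-only restriction on WHERE predicate can be sketched as follows. The table events_by_date and its columns are hypothetical names chosen for illustration; they are not part of the syntax.

```sql
-- Hypothetical table, partitioned by the date column.
CREATE TABLE IF NOT EXISTS events_by_date (
  event_id BIGINT,
  payload  STRING,
  date     DATE
) USING DELTA
PARTITIONED BY (date);

-- The predicate references only the partition column, so it is supported.
REORG TABLE events_by_date WHERE date = DATE'2022-01-01' APPLY (PURGE);

-- A predicate on a non-partition column such as event_id is not supported:
-- REORG TABLE events_by_date WHERE event_id > 100 APPLY (PURGE);
```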

Examples

> REORG TABLE events APPLY (PURGE);

> REORG TABLE events WHERE date >= '2022-01-01' APPLY (PURGE);

> REORG TABLE events
    WHERE date >= current_timestamp() - INTERVAL '1' DAY
    APPLY (PURGE);

> REORG TABLE events APPLY (UPGRADE UNIFORM(ICEBERG_COMPAT_VERSION=2));
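
To check what a REORG run did, one option is to inspect the table history and properties afterwards. This is a minimal sketch that assumes the events table from the examples above; the output depends on the table and the runtime version, so none is shown here.

```sql
-- List recent operations on the table; REORG runs should appear here.
DESCRIBE HISTORY events;

-- Show the table properties, for example to review settings after an
-- UPGRADE UNIFORM run.
SHOW TBLPROPERTIES events;
```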