Delta table properties reference
Delta Lake reserves Delta table properties that start with `delta.`. These properties can have specific meanings and affect behaviors when they are set.
Default table properties
Delta Lake configurations set in the SparkSession override the default table properties for new Delta Lake tables created in the session. The prefix used in the SparkSession is different from the prefix used in table properties.
Delta Lake conf | SparkSession conf |
---|---|
`delta.<conf>` | `spark.databricks.delta.properties.defaults.<conf>` |
For example, to set the `delta.appendOnly = true` property for all new Delta Lake tables created in a session, set the following:
```sql
SET spark.databricks.delta.properties.defaults.appendOnly = true
```
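As a minimal sketch of the end-to-end flow (the table name `events` is an assumption for illustration, not part of the reference):

```sql
-- With the session default above in place, any new Delta table created in
-- this session inherits the property (`events` is a hypothetical table).
CREATE TABLE events (id BIGINT, ts TIMESTAMP) USING DELTA;

-- Verify that the property was applied to the new table.
SHOW TBLPROPERTIES events;
```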
To modify table properties of existing tables, use SET TBLPROPERTIES.
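For example, a minimal sketch using ALTER TABLE (the table name `events` and the chosen property values are illustrative assumptions):

```sql
-- Set or update reserved Delta properties on an existing table.
ALTER TABLE events SET TBLPROPERTIES (
  'delta.appendOnly' = 'true',
  'delta.logRetentionDuration' = 'interval 60 days'
);

-- Remove a property override so the table reverts to its default behavior.
ALTER TABLE events UNSET TBLPROPERTIES ('delta.appendOnly');
```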
Delta table properties
Available Delta table properties include the following:
Property |
---|
`delta.appendOnly`: `true` for this Delta table to be append-only. If append-only, existing records cannot be deleted and existing values cannot be updated. See Delta table properties reference. Data type: `Boolean`. Default: `false` |
`delta.autoOptimize.autoCompact`: `true` for Delta Lake to automatically compact small files during writes to this Delta table. See Enable auto optimize. Data type: `Boolean`. Default: (none) |
`delta.autoOptimize.optimizeWrite`: `true` for Delta Lake to automatically optimize the layout of files for this Delta table during writes. See Enable auto optimize. Data type: `Boolean`. Default: (none) |
`delta.checkpoint.writeStatsAsJson`: `true` for Delta Lake to write file statistics in checkpoints in JSON format for the `stats` column. See Manage column-level statistics in checkpoints. Data type: `Boolean`. Default: `true` |
`delta.checkpoint.writeStatsAsStruct`: `true` for Delta Lake to additionally write file statistics to checkpoints in struct format for the `stats_parsed` column. See Manage column-level statistics in checkpoints. Data type: `Boolean`. Default: (none) |
`delta.columnMapping.mode`: Whether column mapping is enabled for Delta table columns and the corresponding Parquet columns that use different names. See Rename and drop columns with Delta Lake column mapping. Data type: `DeltaColumnMappingMode`. Default: `none` |
`delta.compatibility.symlinkFormatManifest.enabled`: `true` for Delta Lake to configure the Delta table so that all write operations on the table automatically update the manifests. Data type: `Boolean`. Default: `false` |
`delta.dataSkippingNumIndexedCols`: The number of columns for Delta Lake to collect statistics about for data skipping. A value of `-1` means to collect statistics for all columns. See Data skipping with Z-order indexes for Delta Lake. Data type: `Int`. Default: `32` |
`delta.deletedFileRetentionDuration`: The shortest duration for Delta Lake to keep logically deleted data files before deleting them physically. This is to prevent failures in stale readers after compactions or partition overwrites. This value should be large enough to ensure that: it is larger than the longest possible duration of a job if you run VACUUM while there are concurrent readers or writers accessing the Delta table; and, if you run a streaming query that reads from the table, that query does not stop for longer than this value (otherwise, the query may not be able to restart, because it must still read old files). See Configure data retention for time travel. Data type: `CalendarInterval`. Default: `interval 1 week` |
`delta.enableChangeDataFeed`: `true` to enable change data feed. Data type: `Boolean`. Default: `false` |
`delta.isolationLevel`: The degree to which a transaction must be isolated from modifications made by concurrent transactions. Valid values are `Serializable` and `WriteSerializable`. See Isolation levels and write conflicts on Databricks. Data type: `String`. Default: `WriteSerializable` |
`delta.logRetentionDuration`: How long the history for a Delta table is kept. Each time a checkpoint is written, Delta Lake automatically cleans up log entries older than the retention interval. If you set this property to a large enough value, many log entries are retained. This should not impact performance, because operations against the log are constant time. Operations on history are parallel, but become more expensive as the log size increases. See Configure data retention for time travel. Data type: `CalendarInterval`. Default: `interval 30 days` |
`delta.minReaderVersion`: The minimum required protocol reader version for a reader that is allowed to read from this Delta table. See How does Databricks manage Delta Lake feature compatibility?. Data type: `Int`. Default: `1` |
`delta.minWriterVersion`: The minimum required protocol writer version for a writer that is allowed to write to this Delta table. See How does Databricks manage Delta Lake feature compatibility?. Data type: `Int`. Default: `2` |
`delta.randomizeFilePrefixes`: `true` for Delta Lake to generate a random prefix for a file path instead of partition information. For example, this may improve Amazon S3 performance when Delta Lake needs to send very high volumes of Amazon S3 calls to better partition across S3 servers. See Delta table properties reference. Data type: `Boolean`. Default: `false` |
`delta.randomPrefixLength`: When `delta.randomizeFilePrefixes` is set to `true`, the number of characters that Delta Lake generates for random prefixes. See Delta table properties reference. Data type: `Int`. Default: `2` |
`delta.setTransactionRetentionDuration`: The shortest duration within which new snapshots will retain transaction identifiers (for example, `SetTransaction`s). When a new snapshot sees a transaction identifier older than or equal to the duration specified by this property, the snapshot considers it expired and ignores it. Data type: `CalendarInterval`. Default: (none) |
`delta.targetFileSize`: The target file size in bytes or higher units for file tuning. For example, `104857600` (bytes) or `100mb`. See Configure Delta Lake to control data file size. Data type: `String`. Default: (none) |
`delta.tuneFileSizesForRewrites`: `true` to always use lower file sizes for all data layout optimization operations on the Delta table; `false` to never tune to lower file sizes. See Configure Delta Lake to control data file size. Data type: `Boolean`. Default: (none) |
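As an illustration of how these properties combine in practice, the following sketch sets several of the properties above at table creation time. The table name, schema, and chosen values are assumptions for illustration, not recommendations:

```sql
CREATE TABLE sensor_readings (
  device_id STRING,
  reading DOUBLE,
  ts TIMESTAMP
) USING DELTA
TBLPROPERTIES (
  -- Collect data skipping statistics on the first 2 columns only.
  'delta.dataSkippingNumIndexedCols' = '2',
  -- Keep logically deleted files for 14 days before VACUUM can remove them.
  'delta.deletedFileRetentionDuration' = 'interval 14 days',
  -- Target roughly 100 MB data files during layout optimization.
  'delta.targetFileSize' = '100mb'
);
```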