Skip to main content

Table properties reference

Delta Lake and Apache Iceberg use table properties to control table behavior and features. These properties might have specific meanings and affect behaviors when set.

note

All operations that set or update table properties conflict with other concurrent write operations, causing them to fail. Databricks recommends you modify a table property only when there are no concurrent write operations on the table.

Modify table properties

To modify table properties of existing tables, use SET TBLPROPERTIES.

Delta and Iceberg formats

Delta Lake and Apache Iceberg tables share the same table property names, but require different prefixes:

Delta tables: Use the delta. prefix
Iceberg tables: Use the iceberg. prefix

For example:

To enable deletion vectors on a Delta table: delta.enableDeletionVectors
To enable deletion vectors on an Iceberg table: iceberg.enableDeletionVectors

Table properties and SparkSession properties

Each table has its own table properties that control its behavior. Some SparkSession configurations always override table properties. For example, autoCompact.enabled and optimizeWrite.enabled enable auto compaction and optimized writes at the SparkSession level. Databricks recommends using table-scoped configurations for most workloads.

You can set default values for new tables using SparkSession configurations. These defaults only apply to new tables and don't affect existing table properties. SparkSession configurations use a different prefix than table properties, as shown in the following table:

Table property

SparkSession configuration

delta.<conf>

iceberg.<conf>

spark.databricks.delta.properties.defaults.<conf>

spark.databricks.iceberg.properties.defaults.<conf>

For example, to set the appendOnly = true property for all new tables created in a session, set the following:

SQL
-- For Delta tables
SET spark.databricks.delta.properties.defaults.appendOnly = true

-- For Iceberg tables
SET spark.databricks.iceberg.properties.defaults.appendOnly = true

Table properties

The following table properties are available for both Delta Lake and Apache Iceberg tables. Use the delta. prefix for Delta tables and iceberg. prefix for Iceberg tables.

Property	Description
`autoOptimize.optimizeWrite`	`true` to automatically optimize the layout of the files for this table during writes. See Optimized writes. Data type: `Boolean` Default: (none)
`dataSkippingNumIndexedCols`	The number of columns to collect statistics about for data skipping. A value of `-1` means to collect statistics for all columns. See Data skipping. Data type: `Int` Default: `32`
`dataSkippingStatsColumns`	A comma-separated list of column names on which to collect statistics to enhance data skipping functionality. This property takes precedence over `dataSkippingNumIndexedCols`. See Data skipping. Data type: `String` Default: (none)
`deletedFileRetentionDuration`	The shortest duration to keep logically deleted data files before deleting them physically. This prevents failures in stale readers after compactions or partition overwrites. Databricks recommends the default value of 7 days or higher. If your retention period is too short, long-running jobs might have their uncommitted files deleted before the job completes. See Configure data retention for time travel queries. Data type: `CalendarInterval` Default: `interval 1 week`
`enableDeletionVectors`	`true` to enable deletion vectors and predictive I/O for updates. See Deletion vectors in Databricks and Enable deletion vectors. Data type: `Boolean` Default: Depends on workspace admin settings and Databricks Runtime version. See Auto-enable deletion vectors.
`logRetentionDuration`	How long to keep the history for a table. `VACUUM` operations override this retention threshold. Databricks automatically cleans up log entries older than the retention interval each time a checkpoint is written. Setting this property to a large value retains many log entries. This doesn't impact performance because operations against the log are constant time. Operations on history are parallel but become more expensive as the log size increases. See Configure data retention for time travel queries. Data type: `CalendarInterval` Default: `interval 30 days`
`minReaderVersion` (Delta Lake only)	The minimum required protocol reader version to read from this table. Databricks recommends against manually configuring this property. See Delta Lake feature compatibility and protocols. Data type: `Int` Default: `1`
`minWriterVersion` (Delta Lake only)	The minimum required protocol writer version to write to this table. Databricks recommends against manually configuring this property. See Delta Lake feature compatibility and protocols. Data type: `Int` Default: `2`
`format-version` (Apache Iceberg managed tables only)	The Iceberg table format version. Databricks recommends against manually configuring this property. See Use Apache Iceberg v3 features. Data type: `Int` Default: `2`
`randomizeFilePrefixes`	`true` to generate a random prefix for a file path instead of partition information. Data type: `Boolean` Default: `false`
`targetFileSize`	The target file size in bytes or higher units for file tuning. For example, `104857600` (bytes) or `100mb`. See Control data file size. Data type: `String` Default: (none)
`parquet.compression.codec`	The compression codec for a table. Valid values: `ZSTD`, `SNAPPY`, `GZIP`, `LZ4`, `BROTLI` (support varies by format) This property ensures that all future writes to the table use the chosen codec, overriding the cluster or session default (`spark.sql.parquet.compression.codec`). However, one-off DataFrame `.write.option("compression", "...")` settings still take precedence. Available in Databricks Runtime 16.0 and later. Note that existing files aren't rewritten automatically. To recompress existing data with your chosen format, use `OPTIMIZE table_name FULL`. Data type: `String` Default: `ZSTD`
`appendOnly`	`true` to make the table append-only. Append-only tables don't allow deleting existing records or updating existing values. Data type: `Boolean` Default: `false`
`autoOptimize.autoCompact`	Automatically combines small files within table partitions to reduce small file problems. Accepts `auto` (recommended), `true`, `legacy`, or `false`. See Auto compaction. Data type: `String` Default: (none)
`checkpoint.writeStatsAsJson`	`true` to write file statistics in checkpoints in JSON format for the `stats` column. Data type: `Boolean` Default: `false`
`checkpoint.writeStatsAsStruct`	`true` to write file statistics to checkpoints in struct format for the `stats_parsed` column and to write partition values as a struct for `partitionValues_parsed`. Data type: `Boolean` Default: `true`
`checkpointPolicy`	`classic` for classic checkpoints. `v2` for v2 checkpoints. See Compatibility for tables with liquid clustering. Data type: `String` Default: `classic`
`columnMapping.mode`	Enables column mapping for table columns and the corresponding Parquet columns that use different names. See Rename and drop columns with Delta Lake column mapping. Note: Enabling `columnMapping.mode` automatically enables `randomizeFilePrefixes`. Data type: `DeltaColumnMappingMode` Default: `none`
`compatibility.symlinkFormatManifest.enabled` (Delta Lake only)	`true` to configure the Delta table so that all write operations on the table automatically update the manifests. Data type: `Boolean` Default: `false`
`enableChangeDataFeed`	`true` to enable change data feed. See Enable change data feed. Data type: `Boolean` Default: `false`
`enableTypeWidening`	`true` to enable type widening. See Type widening. Data type: `Boolean` Default: `false`
`isolationLevel`	The degree to which a transaction must be isolated from modifications made by concurrent transactions. Valid values are `Serializable` and `WriteSerializable`. See Isolation levels and write conflicts on Databricks. Data type: `String` Default: `WriteSerializable`
`randomPrefixLength`	The number of characters to generate for random prefixes when `randomizeFilePrefixes` is `true`. Data type: `Int` Default: `2`
`setTransactionRetentionDuration`	The shortest duration within which new snapshots retain transaction identifiers (for example, `SetTransaction`s). New snapshots expire and ignore transaction identifiers older than or equal to the duration specified by this property. The `SetTransaction` identifier is used when making writes idempotent. See Idempotent table writes in `foreachBatch` for details. Data type: `CalendarInterval` Default: (none)
`tuneFileSizesForRewrites`	`true` to always use lower file sizes for all data layout optimization operations on the table. `false` prevents tuning to lower file sizes and disables auto-detection. See Control data file size. Data type: `Boolean` Default: (none)

Modify table properties
Delta and Iceberg formats
Table properties and SparkSession properties
Table properties