Rename and drop columns with Delta Lake column mapping

Delta Lake column mapping enables metadata-only changes to rename or drop columns without rewriting data files. Column mapping also allows characters not supported by Parquet in column names, such as spaces, so you can directly ingest CSV or JSON data into Delta Lake without renaming columns.

Prerequisites

Before enabling column mapping, review the Limitations.

Column mapping requires the following:

Delta protocols: reader version 2 or above, and writer version 5 or above. See Delta Lake feature compatibility and protocols.
Reading tables with column mapping enabled requires Databricks Runtime 10.4 LTS and above

Enable column mapping

Use the following command to enable column mapping with id mode on a new table:

SQL
CREATE table <table-name> (
  id INT,
  name STRING
)
USING DELTA
TBLPROPERTIES (
  'delta.columnMapping.mode' = 'id'
);

Use the following command to enable column mapping with name mode on an existing table:

SQL
ALTER TABLE <table-name> SET TBLPROPERTIES (
  'delta.columnMapping.mode' = 'name'
)

For details on column mapping modes, see Column mapping modes.

Rename a column

When column mapping is enabled for a Delta Lake table, you can rename a column:

SQL
ALTER TABLE <table-name> RENAME COLUMN old_col_name TO new_col_name

For more examples, see Update table schemas with schema evolution.

Drop columns

When column mapping is enabled for a Delta Lake table, you can drop one or more columns:

SQL
ALTER TABLE table_name DROP COLUMN col_name
ALTER TABLE table_name DROP COLUMNS (col_name_1, col_name_2, ...)

For more details, see Update table schemas with schema evolution.

Column mapping modes

The delta.columnMapping.mode table property enables metadata-only changes to mark columns as deleted or renamed without rewriting data files. The following modes are available:

none (default): Column mapping is not enabled. Column names are subject to Parquet naming constraints.
name: This enables metadata-only column renames and drops, and allows special characters in column names. name mode may be set on new and existing tables.
id: This enables metadata-only column renames and drops, and allows special characters in column names. id mode must be set on table creation and cannot be set on existing tables.

note

Databricks recommends id mode for most use cases for compatibility purposes. However, name mode is automatically set if you do not provide a value for delta.columnMapping.mode and you enable Iceberg compatibility features such as UniForm.

Supported characters in column names

When column mapping is enabled for a Delta Lake table, you can include spaces and any of these characters in column names: ,;{}()\n\t=.

Remove column mapping

warning

Removing column mapping rewrites all data files to replace physical column names with logical names. This operation doesn't support row-level or physical conflict resolution.

Concurrent write operations will cause a ConcurrentModificationException.

Before removing column mapping:

Pause all concurrent write operations, including streaming jobs and ETL pipelines.
Turn off predictive optimization on the table.
For large tables, schedule this operation during low-activity periods.

You can remove column mapping from a table using the following command:

SQL
ALTER TABLE <table-name> SET TBLPROPERTIES ('delta.columnMapping.mode' = 'none')

To remove column mapping and downgrade the table protocol, see Maintain compatibility.

Maintain compatibility

In Databricks Runtime 15.4 LTS and above, you can use the DROP FEATURE command to remove column mapping and downgrade the table protocol, maintaining compatibility with readers using Databricks Runtime 10.3 and below.

important

Dropping column mapping from a table doesn't remove the random prefixes used in directory names for partitioned tables.

See Drop a Delta Lake table feature and downgrade table protocol.

Column mapping and streaming

Non-additive schema changes, such as column renames or drops, can break streaming reads. Use a schemaTrackingLocation to allow Delta Lake to track schema changes and prevent stream failures.

Configuration

When configuring schemaTrackingLocation:

Each streaming read from a source table must have its own schemaTrackingLocation.
The schemaTrackingLocation must be within the checkpointLocation directory of the streaming write target.
For workloads that read from multiple source tables, specify unique subdirectories within checkpointLocation for each source.

For a complete list of streaming Delta Lake options, see Delta Lake.

Enable column mapping on an active stream

To enable column mapping on an active streaming job:

Stop the stream
Enable column mapping on the table
Restart the stream (first restart - initializes column mapping)
Restart the stream again (second restart - enables schema changes)

Any further schema changes (adding or dropping columns, or changing column types) require you to restart the stream.

Example

To specify a schemaTrackingLocation for a streaming read from a Delta Lake table with column mapping, use the following example:

Python
checkpoint_path = "/path/to/checkpointLocation"

(spark.readStream
  .option("schemaTrackingLocation", checkpoint_path)
  .table("delta_source_table")
  .writeStream
  .option("checkpointLocation", checkpoint_path)
  .toTable("output_table")
)

Limitations

Enabling column mapping might break the following:

Legacy workloads that rely on directory names for reading Delta Lake tables. Partitioned tables with column mapping use random prefixes instead of column names for partition directories. See Delta Lake and Parquet partitioning compatibility.
Downstream operations using change data feed. See Tables with column mapping.
Streaming reads from the Delta Lake table, including in Lakeflow pipelines. See Column mapping and streaming.

Prerequisites​

Enable column mapping​

Rename a column​

Drop columns​

Column mapping modes​

Supported characters in column names​

Remove column mapping​

Maintain compatibility​

Column mapping and streaming​

Configuration​

Enable column mapping on an active stream​

Example​

Limitations​

Additional resources​