Delta Lake feature compatibility and protocols
This article provides an overview of Delta Lake protocols, table features, and compatibility with Delta Lake clients for reads and writes.
The transaction log for a Delta table contains protocol versioning information. See Review Delta Lake table details with describe detail.
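As a rough illustration, the protocol is recorded in the transaction log as a JSON action. The sketch below parses one such line with the Python standard library. The sample line, its values, and the feature name deletionVectors are illustrative assumptions, not output from a real table, and parse_protocol is a hypothetical helper.

```python
import json

# A hypothetical line from a commit file in a table's _delta_log directory.
# The values and the feature name are illustrative assumptions.
SAMPLE_LOG_LINE = (
    '{"protocol": {"minReaderVersion": 3, "minWriterVersion": 7, '
    '"readerFeatures": ["deletionVectors"], '
    '"writerFeatures": ["deletionVectors"]}}'
)


def parse_protocol(log_line):
    """Return the protocol action found in one log line, or None if absent."""
    action = json.loads(log_line)
    protocol = action.get("protocol")
    if protocol is None:
        return None
    return {
        "minReaderVersion": protocol["minReaderVersion"],
        "minWriterVersion": protocol["minWriterVersion"],
        # The feature lists are only present when table features are in use.
        "readerFeatures": sorted(protocol.get("readerFeatures", [])),
        "writerFeatures": sorted(protocol.get("writerFeatures", [])),
    }


protocol = parse_protocol(SAMPLE_LOG_LINE)
print(protocol)
```

In practice, the describe detail command referenced above is the simpler way to inspect these values. Parsing the log directly is shown only to make the structure concrete.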
How does the table protocol specify read and write compatibility?
Every Delta table has a protocol specification which indicates the set of capabilities required to read and write to the table. The protocol specification is used by applications that read from or write to the table to determine if they can handle all the features that the table supports. If an application does not know how to handle a feature that is listed as supported in the protocol of a table, then that application is not able to read or write that table.
Most new capabilities and functionality added to Delta Lake require upgrading the table protocol.
The following table provides an overview of key terms used to describe Delta Lake protocols:
Term | Description |
---|---|
Delta Lake client | Any system that reads or writes to a Delta table. |
Read protocol | Specifies the support required for a Delta Lake client to read a table. |
Write protocol | Specifies the support required for a Delta Lake client to write to a table. |
minReaderVersion | Component of the reader protocol. Valid values are integers 1 through 3. |
minWriterVersion | Component of the writer protocol. Valid values are integers 2 through 7. |
Table feature | A fine-grained alternative to protocol versions. Table features map to optionally enabled Delta Lake features. |
Writer feature | A table feature tied to a write protocol. |
Reader feature | A table feature tied to a read protocol. |
Write protocols and writer features only impact compatibility with writer clients, meaning that read-only access to the table from legacy workloads is still supported. Read protocols and reader features impact both read and write compatibility.
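The asymmetry between read and write protocols can be sketched with a small model. This is a simplified illustration that only considers integer versions. The Versions class, the function names, and the example version numbers are hypothetical and do not correspond to any Delta Lake API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versions:
    """Reader and writer protocol versions, for a table or for a client."""
    reader: int
    writer: int


def can_read(client: Versions, table: Versions) -> bool:
    # Only the read protocol matters for read-only access.
    return client.reader >= table.reader


def can_write(client: Versions, table: Versions) -> bool:
    # A writer must satisfy both the read protocol and the write protocol.
    return can_read(client, table) and client.writer >= table.writer


# Hypothetical legacy client and two hypothetical tables.
legacy_client = Versions(reader=1, writer=2)
writer_upgraded_table = Versions(reader=1, writer=4)
reader_upgraded_table = Versions(reader=2, writer=5)

read_after_writer_upgrade = can_read(legacy_client, writer_upgraded_table)
write_after_writer_upgrade = can_write(legacy_client, writer_upgraded_table)
read_after_reader_upgrade = can_read(legacy_client, reader_upgraded_table)
write_after_reader_upgrade = can_write(legacy_client, reader_upgraded_table)
print(read_after_writer_upgrade, write_after_writer_upgrade)
print(read_after_reader_upgrade, write_after_reader_upgrade)
```

In this model, the legacy client can still read the table whose write protocol was upgraded, but it can neither read nor write the table whose read protocol was upgraded.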
Not all Delta Lake features are compatible with each other.
Some table features cannot be dropped once enabled. See Drop a Delta Lake table feature and downgrade table protocol.
Table features for protocol compatibility
In Databricks Runtime 12.2 LTS and above, Databricks uses table features to indicate support for features and compatibility with readers and writers. Table features use granular flags to specify which features are supported by a given table. Table features replace the legacy protocol versioning scheme by introducing new features to the Delta Lake protocol.
Table writer features indicate features that impact the way data is written. Table writer features require minWriterVersion == 7. Features implemented as writer features do not block reader clients.
Table reader features indicate features that impact the way data is read. All table reader features are also table writer features. Table reader features require minReaderVersion == 3 and minWriterVersion == 7. A client cannot write to a table that it cannot read.
When table features are enabled, all features supported by the protocol for the table appear in the respective readerFeatures or writerFeatures lists. When you drop features from a table, the table might stop using these lists so that it can resolve to the lowest possible protocol. See Lowest possible protocol.
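The reader and writer feature rules in this section can be sketched as set comparisons. This is a minimal model, not the logic of any particular Delta Lake client. The function names are hypothetical, and the feature names deletionVectors and rowTracking are used only as illustrative assumptions.

```python
def supports_read(client_features, protocol):
    """A reader must understand every feature listed in readerFeatures."""
    return set(protocol.get("readerFeatures", [])) <= set(client_features)


def supports_write(client_features, protocol):
    """A writer must be able to read and must understand writerFeatures."""
    return (
        supports_read(client_features, protocol)
        and set(protocol.get("writerFeatures", [])) <= set(client_features)
    )


# Hypothetical protocol: one reader feature (also listed as a writer feature)
# and one writer-only feature.
table_protocol = {
    "minReaderVersion": 3,
    "minWriterVersion": 7,
    "readerFeatures": ["deletionVectors"],
    "writerFeatures": ["deletionVectors", "rowTracking"],
}

reader_only_client = {"deletionVectors"}
full_client = {"deletionVectors", "rowTracking"}
old_client = set()

print(supports_read(reader_only_client, table_protocol))   # writer feature does not block reads
print(supports_write(reader_only_client, table_protocol))  # but it does block writes
print(supports_write(full_client, table_protocol))
print(supports_read(old_client, table_protocol))
```

The writer-only feature does not prevent the first client from reading, while the client that lacks the reader feature can neither read nor write.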
Integer-based protocol versions and legacy compatibility
All tables include an integer-based protocol version represented by minReaderVersion and minWriterVersion. Functionality implemented using table features builds upon these protocol versions, but many legacy reader and writer clients continue to use protocol versions to manage compatibility. Delta Lake attempts to resolve the table protocol to the lowest possible version to maintain maximum compatibility with modern and legacy Delta clients. See Lowest possible protocol.
In the integer-based protocol versioning scheme, each version number bundles multiple features, and features across version numbers are cumulative. This means that to be compliant with the Delta protocol, clients must implement support for all reader or writer features present in a given version, including all previously released features.
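To make the cumulative behavior concrete, the following sketch collects every feature a writer must support for a given integer writer version. The mapping of features to versions is an assumption based on the OSS Delta protocol specification and is not stated in this article. The function name is hypothetical.

```python
# Features introduced at each legacy writer version. This mapping is an
# assumption drawn from the OSS Delta protocol specification.
WRITER_VERSION_FEATURES = {
    2: {"appendOnly", "invariants"},
    3: {"checkConstraints"},
    4: {"changeDataFeed", "generatedColumns"},
    5: {"columnMapping"},
    6: {"identityColumns"},
}


def features_required_for_writer(version):
    """Return all features a writer must support at the given version."""
    required = set()
    for introduced_at, features in WRITER_VERSION_FEATURES.items():
        # Versions are cumulative: include everything released up to `version`.
        if introduced_at <= version:
            required |= features
    return required


print(sorted(features_required_for_writer(4)))
```

Under this model, a writer that claims support for version 4 must also handle everything bundled into versions 2 and 3, even if the table only uses one of those features.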
Databricks includes non-breaking partial support for table features in all supported Databricks Runtime versions. OSS Delta clients choose how to implement support for given features.
When does the table protocol change?
The protocol for a table changes under the following conditions:
- If a new feature is enabled, the protocol is upgraded.
- If a table feature is dropped, the protocol is downgraded.
Disabling a table feature does not result in protocol downgrade. You must drop the feature to fully remove it from the table protocol. Not all table features can be dropped. See Drop a Delta Lake table feature and downgrade table protocol.
All protocol change operations conflict with all concurrent writes.
Streaming reads fail when they encounter a commit that changes table metadata. If you want the stream to continue, you must restart it. For recommended methods, see Production considerations for Structured Streaming.
Most protocol version upgrades are irreversible, and upgrading the protocol version might break the existing Delta Lake table readers, writers, or both. Databricks recommends you upgrade specific tables only when needed, such as to opt in to new features in Delta Lake. You should also check to make sure that all of your current and future production tools support Delta Lake tables with the new protocol version.
Protocol downgrades are available for some features. See Drop a Delta Lake table feature and downgrade table protocol.
When is the table protocol upgraded?
When you enable a feature on a table, the table protocol is automatically upgraded. Some features are enabled automatically based on the syntax used in CREATE or ALTER table statements, while other features require explicit enablement through setting table properties. Sometimes you must explicitly enable multiple table features in order to support desired functionality. In other cases, enabling functionality might automatically enable other table features. See the Databricks documentation for the functionality and syntax you're using to determine which table features are required.
Reader features require upgrading both the read protocol and the write protocol. Writer features only require upgrading the write protocol.
As an example, support for CHECK constraints is a writer feature: only writing applications need to know about CHECK constraints and enforce them.
In contrast, column mapping requires upgrading both the read and write protocols. Because the data is stored differently in the table, reader applications must understand column mapping so they can read the data correctly.
Databricks recommends against changing the minReaderVersion and minWriterVersion table properties. Changing these table properties does not prevent protocol upgrade. Setting these values to a lower value does not downgrade the table. See Drop a Delta Lake table feature and downgrade table protocol.
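The upgrade rule described in this section can be modeled as taking the maximum of the current and required versions. The sketch below uses the two examples from the text. The version numbers attached to each feature are assumptions based on the OSS Delta protocol, and enable_feature is a hypothetical helper rather than a Delta Lake API.

```python
# Minimum (reader, writer) versions per feature. The numbers are assumptions
# based on the OSS Delta protocol; the feature keys are illustrative.
FEATURE_REQUIREMENTS = {
    "checkConstraints": (1, 3),  # writer feature: only the write protocol moves
    "columnMapping": (2, 5),     # reader and writer feature: both protocols move
}


def enable_feature(protocol, feature):
    """Return the (reader, writer) protocol after enabling a feature."""
    reader, writer = protocol
    required_reader, required_writer = FEATURE_REQUIREMENTS[feature]
    # Enabling a feature never lowers a version, so take the maximum.
    return (max(reader, required_reader), max(writer, required_writer))


after_check = enable_feature((1, 2), "checkConstraints")
after_mapping = enable_feature(after_check, "columnMapping")
print(after_check, after_mapping)
```

Enabling the writer feature leaves the reader version untouched, while enabling column mapping raises both components.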
Lowest possible protocol
By default, Delta Lake attempts to use the lowest protocol possible to represent all features marked as supported by the table.
This behavior can only result in lowering the table protocol, meaning that the minReaderVersion or minWriterVersion might change to lower values for a table.
You must run the DROP FEATURE command to remove a table feature from the list of supported features in the table protocol. Table features are never dropped automatically.
If all Delta Lake features present in a table are fully supported in a lower protocol version, the table might revert to a protocol version that doesn't use table features to indicate reader and writer compatibility. When this protocol downgrade occurs, the table might drop either the readerFeatures or both readerFeatures and writerFeatures from the table protocol. This doesn't result in any Delta Lake features being disabled and only occurs when table features aren't required in the table protocol.
All changes that lower the table protocol increase compatibility with reader and writer clients. This is because reader and writer clients must respect lower protocol versions even if they support higher protocol versions.
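The following sketch models which feature lists a table still needs once its supported features are known. It is a deliberately simplified illustration of the behavior described in this section, and the actual resolution logic in Delta Lake may differ. The Feature type, its fields, and the example feature names are hypothetical.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    reader_writer: bool         # True if readers must also understand it
    needs_table_features: bool  # True if no legacy protocol version covers it


def feature_lists_to_keep(features):
    """Return the feature lists the protocol still needs for these features."""
    table_feature_only = [f for f in features if f.needs_table_features]
    if not table_feature_only:
        # Everything fits a legacy protocol version: no lists are required.
        return []
    if any(f.reader_writer for f in table_feature_only):
        return ["readerFeatures", "writerFeatures"]
    # Only writer features need table features: readerFeatures can be dropped.
    return ["writerFeatures"]


# Illustrative features; the classification of each is an assumption.
column_mapping = Feature("columnMapping", reader_writer=True, needs_table_features=False)
row_tracking = Feature("rowTracking", reader_writer=False, needs_table_features=True)
deletion_vectors = Feature("deletionVectors", reader_writer=True, needs_table_features=True)

print(feature_lists_to_keep([column_mapping]))
print(feature_lists_to_keep([column_mapping, row_tracking]))
print(feature_lists_to_keep([column_mapping, deletion_vectors]))
```

None of the three cases disables a feature. The function only decides how the protocol represents the features that remain supported.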
Do table features change how Delta Lake features are enabled?
If you only interact with Delta tables through Databricks, you can continue to track support for Delta Lake features using minimum Databricks Runtime requirements. Databricks supports reading Delta tables that have been upgraded to table features in all Databricks Runtime LTS releases, as long as all features used by the table are supported by that release.
If you read and write from Delta tables using other systems, you might need to consider how table features impact compatibility, because there is a risk that the system might not understand the upgraded protocol versions.
Table features are introduced to the Delta Lake format for writer version 7 and reader version 3. Databricks has backported code to all supported Databricks Runtime LTS versions to add support for table features, but only for those features already supported in that Databricks Runtime. This means that while you can opt in to using table features to enable generated columns and still work with these tables in Databricks Runtime 9.1 LTS, tables with identity columns enabled (which requires Databricks Runtime 10.4 LTS) are still not supported in that Databricks Runtime.
How does Databricks manage Delta Lake feature compatibility?
Databricks introduces support for new Delta Lake features and optimizations that build on top of Delta Lake in Databricks Runtime releases. Databricks optimizations that leverage Delta Lake features respect the protocols used in OSS Delta Lake for compatibility. Many Databricks optimizations require enabling Delta Lake features on a table, and some Databricks products such as DLT depend on many table features.
- All tables written by lower Databricks Runtime versions have full read and write support in higher Databricks Runtime versions.
- Tables written by higher Databricks Runtime versions might use table features that are not supported in lower Databricks Runtime versions.
- Some features might allow writes from lower Databricks Runtime versions without fully applying all optimizations related to the enabled table features.
When working with table features that have backported support to lower Databricks Runtime versions, some operations that run on a given Databricks Runtime version might not run on the corresponding OSS Delta version. If your development cycle or data architecture includes OSS Delta Lake, you should always test compatibility in OSS Delta clients before enabling table features on production tables.
Delta Lake features and required Databricks Runtime versions
Features are enabled on a table-by-table basis. The following table lists the lowest Databricks Runtime version with full support for the indicated feature. Full support means that all generally available functionality for both reads and writes is supported.
Feature | Requires Databricks Runtime version or later | Documentation |
---|---|---|
| All supported Databricks Runtime versions | |
Change data feed | All supported Databricks Runtime versions | |
Generated columns | All supported Databricks Runtime versions | |
Column mapping | All supported Databricks Runtime versions | |
Identity columns | All supported Databricks Runtime versions | |
Table features | Databricks Runtime 12.2 LTS | |
Deletion vectors | Databricks Runtime 12.2 LTS | |
TimestampNTZ | Databricks Runtime 13.3 LTS | |
UniForm | Databricks Runtime 13.3 LTS | |
Liquid clustering | Databricks Runtime 13.3 LTS | |
Row tracking | Databricks Runtime 14.3 LTS | |
Type widening | Databricks Runtime 15.4 LTS | |
Variant | Databricks Runtime 15.4 LTS | |
Collations | Databricks Runtime 16.1 | |
Protected checkpoints | Databricks Runtime 16.3 | Drop a Delta Lake table feature and downgrade table protocol |
See Databricks Runtime release notes versions and compatibility.
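As a small worked example, the lowest runtime that fully supports a combination of features is the highest minimum among them. The sketch below copies the version requirements from the table above. The function name and the tuple representation of versions are hypothetical choices made for illustration.

```python
# Minimum Databricks Runtime per feature, copied from the table above as
# (major, minor). Features available in all supported versions are omitted.
MIN_RUNTIME = {
    "Table features": (12, 2),
    "Deletion vectors": (12, 2),
    "TimestampNTZ": (13, 3),
    "UniForm": (13, 3),
    "Liquid clustering": (13, 3),
    "Row tracking": (14, 3),
    "Type widening": (15, 4),
    "Variant": (15, 4),
    "Collations": (16, 1),
    "Protected checkpoints": (16, 3),
}


def lowest_runtime(features):
    """Return the lowest (major, minor) runtime with full support.

    Returns None when no listed feature imposes a minimum. Names that are
    not in MIN_RUNTIME are treated as imposing no minimum.
    """
    versions = [MIN_RUNTIME[name] for name in features if name in MIN_RUNTIME]
    return max(versions) if versions else None


print(lowest_runtime(["Deletion vectors", "Row tracking"]))
print(lowest_runtime(["Change data feed"]))
```

A table that uses both deletion vectors and row tracking therefore needs the runtime required by row tracking, the later of the two.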
DLT and Databricks SQL automatically upgrade runtime environments with regular releases to support new features. See DLT release notes and the release upgrade process and Databricks SQL release notes.
Features by protocol version
The OSS Delta Lake protocol has standardized on table features, but some reader and writer clients have not implemented support for table features and continue to use legacy minWriterVersion and minReaderVersion protocols.
Some clients might not have support for all Delta Lake features, including features that use legacy protocol versioning. Consult documentation for your Delta Lake client to confirm support for features. Always test compatibility before enabling new features on production tables.
The following table shows minimum reader and writer protocol versions required for Delta Lake features, as well as indicating whether a table feature needs to be respected for writes only or both reads and writes.
If you are only concerned with Databricks Runtime compatibility, see How does Databricks manage Delta Lake feature compatibility?.
Feature | minWriterVersion | minReaderVersion | Table feature |
---|---|---|---|
| 2 | 1 | Writer |
| 3 | 1 | Writer |
| 4 | 1 | Writer |
| 4 | 1 | Writer |
| 5 | 2 | Reader and writer |
| 6 | 1 | Writer |
| 7 | 1 | Writer |
| 7 | 3 | Reader and writer |
| 7 | 3 | Reader and writer |
| 7 | 3 | Reader and writer |
| 7 | 2 | Writer (1) |
| 7 | 3 | Reader and writer |
| 7 | 3 | Reader and writer |
| 7 | 3 | Reader and writer |
| 7 | 1 | Writer |
(1) Requires that column mapping is enabled.