Enrich Delta Lake tables with custom metadata
Databricks recommends always providing comments for tables and columns in tables. You can generate these comments using AI. See Add AI-generated comments to a table.
Unity Catalog also provides the ability to tag data. See Apply tags.
You can also log messages for individual commits to tables in a field in the Delta Lake transaction log.
Set user-defined commit metadata
You can specify user-defined strings as metadata in commits, either using the DataFrameWriter option userMetadata
or the SparkSession configuration spark.databricks.delta.commitInfo.userMetadata
. If both of them have been specified, then the option takes preference. This user-defined metadata is readable in the DESCRIBE HISTORY
operation. See Work with Delta Lake table history.
SET spark.databricks.delta.commitInfo.userMetadata=overwritten-for-fixing-incorrect-data
INSERT OVERWRITE default.people10m SELECT * FROM morePeople
df.write.format("delta") \
.mode("overwrite") \
.option("userMetadata", "overwritten-for-fixing-incorrect-data") \
.save("/tmp/delta/people10m")
df.write.format("delta")
.mode("overwrite")
.option("userMetadata", "overwritten-for-fixing-incorrect-data")
.save("/tmp/delta/people10m")