Delta Lake guide
Delta Lake is an open-source storage layer that brings reliability to data lakes. It provides ACID transactions and scalable metadata handling, and it unifies streaming and batch data processing. Delta Lake runs on top of your existing data lake and is fully compatible with Apache Spark APIs. On Databricks, you can configure Delta Lake based on your workload patterns.
Databricks adds optimized layouts and indexes to Delta Lake for fast interactive queries.
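As a rough illustration of how one Delta table can serve both batch and streaming workloads through the standard Spark reader and writer APIs, here is a minimal sketch. The helper names (`write_batch`, `read_batch`, `read_stream`) and the example path are hypothetical and chosen only for illustration. The sketch assumes you pass in an existing Spark session with Delta Lake available, such as the `spark` object predefined in a Databricks notebook.

```python
from typing import Any

# Format name used with Spark's DataFrame reader and writer for Delta tables.
DELTA_FORMAT = "delta"


def write_batch(df: Any, path: str, mode: str = "append") -> None:
    """Write a Spark DataFrame to a Delta table at `path` (hypothetical helper)."""
    df.write.format(DELTA_FORMAT).mode(mode).save(path)


def read_batch(spark: Any, path: str) -> Any:
    """Read the current contents of the Delta table at `path` as a batch DataFrame."""
    return spark.read.format(DELTA_FORMAT).load(path)


def read_stream(spark: Any, path: str) -> Any:
    """Open the same Delta table at `path` as a streaming DataFrame."""
    return spark.readStream.format(DELTA_FORMAT).load(path)
```

In a notebook, you might call `write_batch(df, "/tmp/delta/events")` and then `read_batch(spark, "/tmp/delta/events")` or `read_stream(spark, "/tmp/delta/events")` against the same location. The path here is an assumption, not a location this guide prescribes.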
This guide provides an introductory overview, quickstarts, and guidance for using Delta Lake on Databricks. It covers the following topics:
- Introduction
- Delta Lake quickstart
- Introductory notebooks
- Ingest data into Delta Lake
- Table batch reads and writes
- Table streaming reads and writes
- Table deletes, updates, and merges
- Change data feed
- Table utility commands
- Constraints
- Table protocol versioning
- Delta column mapping
- Unity Catalog
- Delta Lake APIs
- Concurrency control
- Access Delta tables from external data processing engines
- Migration guide
- Best practices: Delta Lake
- Frequently asked questions (FAQ)
- Delta Lake resources
- Optimizations
- Delta table properties reference