Documentation for Databricks on AWS

Updated Apr 15, 2021

Delta Engine

Delta Engine is a high-performance, Apache Spark-compatible query engine that efficiently processes data in data lakes, including data stored in open source Delta Lake. Delta Engine optimizations accelerate data lake operations and support workloads ranging from large-scale ETL processing to ad hoc, interactive queries. Many of these optimizations take place automatically; you get their benefits just by using Databricks for your data lakes.

  • Optimize performance with file management
    • Compaction (bin-packing)
    • Data skipping
    • Z-Ordering (multi-dimensional clustering)
    • Notebooks
    • Improve interactive query performance
    • Frequently asked questions (FAQ)
  • Auto Optimize
    • How Auto Optimize works
    • Usage
    • When to opt in and opt out
    • Example workflow: Streaming ingest with concurrent deletes or updates
    • Frequently asked questions (FAQ)
  • Optimize performance with caching
    • Delta and Apache Spark caching
    • Delta cache consistency
    • Use Delta caching
    • Cache a subset of the data
    • Monitor the Delta cache
    • Configure the Delta cache
  • Dynamic file pruning
  • Isolation levels
    • Set the isolation level
  • Bloom filter indexes
    • Configuration
    • Create a Bloom filter index
    • Drop a Bloom filter index
    • Display the list of Bloom filter indexes
    • Notebook
  • Optimize join performance
    • Range join optimization
    • Skew join optimization
  • Optimized data transformation
    • Higher-order functions
    • Transform complex data types


© Databricks 2021. All rights reserved. Apache, Apache Spark, Spark, and the Spark logo are trademarks of the Apache Software Foundation.