Create fully managed pipelines using Delta Live Tables with serverless compute


Serverless DLT pipelines are in Public Preview. To learn about enabling serverless DLT pipelines, contact your Databricks account team.

This article explains how to use Delta Live Tables with serverless compute to run your pipeline updates with fully managed compute, and details serverless compute features that improve the performance of your pipelines.

Use serverless DLT pipelines to run your Delta Live Tables pipelines without configuring and deploying infrastructure. With serverless DLT pipelines, you focus on implementing your data ingestion and transformation, and Databricks manages the compute resources, including optimizing and scaling compute for your workloads. Serverless DLT pipelines include the following capabilities:

  • Automatically optimized compute that runs only when needed.

  • Reliable and fully managed compute resources.

  • More efficient dataset updates with incremental refresh for materialized views.

  • Faster startup for the compute resources that run a pipeline update.

Serverless DLT pipelines also have the following features to optimize the processing performance of pipelines, support more efficient usage of compute resources, and help lower the cost of running your pipeline:

  • Stream pipelining: In serverless DLT pipelines, microbatches are pipelined to improve utilization, throughput, and latency for streaming data workloads such as data ingestion. Instead of running microbatches sequentially, as standard Spark Structured Streaming does, serverless DLT pipelines run microbatches concurrently, which leads to better compute resource utilization. Stream pipelining is enabled by default in serverless DLT pipelines.

  • Vertical autoscaling: Serverless DLT pipelines add to the horizontal autoscaling provided by Databricks Enhanced Autoscaling by automatically allocating the most cost-efficient instance types that can run your Delta Live Tables pipeline without failing because of out-of-memory errors. See What is vertical autoscaling?
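
The benefit of stream pipelining described above can be illustrated with a small, deterministic toy model. This is only a sketch under stated assumptions: it treats each microbatch as a fixed sequence of stages with made-up durations and compares the total time when batches run one after another with the total time when a stage may start the next batch as soon as it is free. The function names and stage durations are hypothetical, and the model does not describe how Databricks actually schedules microbatches.

```python
from typing import List


def sequential_makespan(batches: List[List[float]]) -> float:
    """Total time when each microbatch finishes every stage before the next one starts."""
    return sum(sum(stages) for stages in batches)


def pipelined_makespan(batches: List[List[float]]) -> float:
    """Total time when microbatches overlap.

    A stage starts work on a microbatch as soon as (a) the stage is free and
    (b) the previous stage of that same microbatch has finished.
    """
    if not batches:
        return 0.0
    stage_free_at = [0.0] * len(batches[0])
    for stages in batches:
        ready = 0.0  # time at which this batch is ready for its next stage
        for s, duration in enumerate(stages):
            start = max(ready, stage_free_at[s])
            ready = start + duration
            stage_free_at[s] = ready
    return stage_free_at[-1]


# Hypothetical workload: 4 microbatches, each with read/process/write
# stages taking 2, 3, and 1 time units.
batches = [[2.0, 3.0, 1.0]] * 4

print(sequential_makespan(batches))  # 24.0
print(pipelined_makespan(batches))   # 15.0
```

In this toy workload the slowest stage stays busy almost continuously once the pipeline fills, which is the kind of utilization gain the feature description refers to.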

Because cluster creation permission is not required, all workspace users can use serverless DLT pipelines to run their workflows.


  • To use serverless DLT pipelines, your workspace must have Unity Catalog enabled.

Run a pipeline update with serverless DLT pipelines


Because compute resources are fully managed for serverless DLT pipelines, compute settings are unavailable in the Delta Live Tables UI for a serverless pipeline. When you enable serverless, any compute settings you have configured for a pipeline are removed. If you switch a pipeline back to non-serverless updates, these compute settings must be re-added to the pipeline configuration. You also cannot manually add compute settings in a clusters object in the JSON configuration for the pipeline.

To run a pipeline update that uses serverless DLT pipelines, select the Serverless checkbox when you create or edit a pipeline.
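
If you manage pipeline settings as JSON rather than through the UI, the behavior described above can be sketched as a small transformation on the settings object. This is a minimal sketch, not an official tool: it assumes the JSON settings expose a boolean `serverless` field (this article only mentions the Serverless checkbox and the `clusters` object, so confirm the field name against the pipeline settings reference), and the example settings, pipeline name, and helper function names are hypothetical.

```python
import copy
import json
from typing import Any, Dict, List, Tuple


def to_serverless(settings: Dict[str, Any]) -> Tuple[Dict[str, Any], List[Dict[str, Any]]]:
    """Return serverless settings plus the removed compute settings.

    The removed `clusters` entries are returned so they can be re-added
    if the pipeline is later switched back to non-serverless updates.
    """
    updated = copy.deepcopy(settings)
    removed_clusters = updated.pop("clusters", [])
    updated["serverless"] = True  # assumed field name; verify before use
    return updated, removed_clusters


def to_classic(settings: Dict[str, Any], clusters: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Switch back to non-serverless updates by re-adding the saved compute settings."""
    updated = copy.deepcopy(settings)
    updated["serverless"] = False
    updated["clusters"] = copy.deepcopy(clusters)
    return updated


# Hypothetical pipeline settings with classic compute configured.
classic_settings = {
    "name": "example-pipeline",
    "development": True,
    "clusters": [
        {"label": "default", "autoscale": {"min_workers": 1, "max_workers": 5, "mode": "ENHANCED"}}
    ],
}

serverless_settings, saved_clusters = to_serverless(classic_settings)
print(json.dumps(serverless_settings, indent=2))
```

Keeping the removed `clusters` entries somewhere safe, for example in version control, makes it easier to restore them if you switch the pipeline back to non-serverless updates.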

How are materialized views refreshed in serverless DLT pipelines?

When possible, query results are updated incrementally for materialized views in a serverless pipeline. When an incremental refresh is performed, the results are equivalent to a full recomputation. If the materialized view cannot be incrementally refreshed, the refresh process uses a full refresh instead. See Refresh operations for materialized views.
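
The equivalence between an incremental refresh and a full recomputation can be illustrated with a toy aggregate in plain Python. This is a conceptual sketch only: it models a materialized view as a per-key sum over append-only rows, and the function names and sample data are hypothetical. It does not reflect how Databricks decides whether a view can be refreshed incrementally.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Row = Tuple[str, int]  # (key, amount)


def full_refresh(rows: Iterable[Row]) -> Dict[str, int]:
    """Recompute the per-key totals from every source row."""
    totals: Dict[str, int] = defaultdict(int)
    for key, amount in rows:
        totals[key] += amount
    return dict(totals)


def incremental_refresh(view: Dict[str, int], new_rows: Iterable[Row]) -> Dict[str, int]:
    """Update the existing totals using only the newly arrived rows."""
    updated = dict(view)
    for key, amount in new_rows:
        updated[key] = updated.get(key, 0) + amount
    return updated


base_rows = [("a", 10), ("b", 5), ("a", 7)]
new_rows = [("b", 3), ("c", 1)]

view = full_refresh(base_rows)
incremental = incremental_refresh(view, new_rows)
recomputed = full_refresh(base_rows + new_rows)

print(incremental == recomputed)  # True
```

The incremental path reads only the two new rows, while the full path reads all five, yet both produce the same result.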

What is vertical autoscaling?

Vertical autoscaling in serverless DLT pipelines automatically allocates the most cost-efficient available instance types to run your Delta Live Tables pipeline updates without failing because of out-of-memory errors. Vertical autoscaling scales up when larger instance types are required to run a pipeline update and scales down when it determines that the update can run on smaller instance types. It also determines whether driver nodes, worker nodes, or both should be scaled up or down.

Vertical autoscaling is used for all serverless DLT pipelines, including pipelines used by Databricks SQL materialized views and streaming tables.

Vertical autoscaling works by detecting pipeline updates that have failed because of out-of-memory errors. When these failures are detected, vertical autoscaling allocates larger instance types based on the out-of-memory data collected from the failed update. In production mode, a new update that uses the new compute resources is started automatically. In development mode, the new compute resources are used when you manually start a new update.

If vertical autoscaling detects that the memory of the allocated instances is consistently underutilized, it scales down the instance types used for the next pipeline update.
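
The scale-up and scale-down behavior described in this section can be summarized as a simple decision rule. The sketch below is a toy model under loudly stated assumptions: the instance memory sizes, the 40% low-utilization threshold, the one-step-at-a-time sizing, and all function names are hypothetical. The actual service sizes instances from the out-of-memory data it collects, and its thresholds and algorithm are not described in this article.

```python
from typing import Sequence

# Hypothetical ladder of instance memory sizes in GB (not real instance types).
INSTANCE_MEMORY_GB = [16, 32, 64, 128]


def next_instance_index(
    current: int,
    oom_failed: bool,
    recent_memory_utilization: Sequence[float],
    low_utilization: float = 0.4,  # assumed threshold, for illustration only
) -> int:
    """Pick the instance size for the next update in this toy model."""
    largest = len(INSTANCE_MEMORY_GB) - 1
    if oom_failed:
        # The last update failed with an out-of-memory error: move up one size.
        return min(current + 1, largest)
    consistently_low = bool(recent_memory_utilization) and all(
        u < low_utilization for u in recent_memory_utilization
    )
    if consistently_low and current > 0:
        # Memory was underutilized in every recent update: move down one size.
        return current - 1
    return current


def restarts_automatically(development_mode: bool) -> bool:
    """Production mode starts a new update automatically; development mode waits for a manual start."""
    return not development_mode


index = 1                                                  # start at 32 GB
index = next_instance_index(index, True, [])               # OOM failure
print(INSTANCE_MEMORY_GB[index])                           # 64
index = next_instance_index(index, False, [0.2, 0.3, 0.25])
print(INSTANCE_MEMORY_GB[index])                           # 32
```

Requiring every recent update to be below the threshold is one way to model "consistently underutilized"; a single low reading does not trigger a scale-down in this sketch.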