Standalone pipelines vs. Lakeflow pipelines

Databricks offers two ways to build materialized views and streaming tables: standalone pipelines and Lakeflow pipelines. Both run on the same declarative engine and produce Unity Catalog managed tables. The difference is how much of the pipeline you author and operate yourself.

A standalone materialized view or streaming table is a single dataset defined with SQL syntax. Databricks creates and manages a pipeline behind the scenes to refresh it. You create and refresh standalone datasets from a Databricks SQL warehouse, or from a notebook on serverless general compute using spark.sql(). See Standalone pipelines.
A Lakeflow pipeline is a pipeline that you author and operate as a unit. It can contain many datasets, in SQL and Python, with dependency orchestration, lineage, and pipeline-wide operational features. See What are pipelines?.
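
As a rough sketch of the notebook path for standalone datasets, the following Python helpers build the SQL for a standalone materialized view and run it with spark.sql(). The helper functions and the catalog, schema, table, and column names are hypothetical and only illustrate the idea; the `spark` argument is assumed to be the session that a Databricks notebook provides. Check the statement syntax against the Databricks SQL reference before relying on it.

```python
def create_mv_sql(name: str, query: str) -> str:
    """Build a CREATE MATERIALIZED VIEW statement for a standalone dataset."""
    return f"CREATE MATERIALIZED VIEW IF NOT EXISTS {name} AS {query}"


def refresh_mv_sql(name: str) -> str:
    """Build the statement that refreshes an existing materialized view."""
    return f"REFRESH MATERIALIZED VIEW {name}"


def create_and_refresh(spark, name: str, query: str) -> None:
    """Run both statements through spark.sql() on the given session."""
    spark.sql(create_mv_sql(name, query))
    spark.sql(refresh_mv_sql(name))


# Hypothetical names, for illustration only.
MV_NAME = "main.sales.daily_revenue"
MV_QUERY = (
    "SELECT order_date, SUM(amount) AS revenue "
    "FROM main.sales.orders GROUP BY order_date"
)

print(create_mv_sql(MV_NAME, MV_QUERY))
print(refresh_mv_sql(MV_NAME))

# In a notebook on serverless general compute, `spark` is assumed to be predefined:
# create_and_refresh(spark, MV_NAME, MV_QUERY)
```

The same statements can be run directly from a Databricks SQL warehouse or the SQL editor without the Python wrapper.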

When you create a standalone materialized view or streaming table, the managed pipeline appears on the Jobs & Pipelines page with a pipeline type of MV/ST. A Lakeflow pipeline appears with a pipeline type of ETL.

When to use a standalone pipeline

Use standalone materialized views and streaming tables when:

You accelerate queries or transform data with a single materialized view or streaming table.
You work from a Databricks SQL warehouse, the SQL editor, or a notebook on serverless general compute, and schedule refreshes with SCHEDULE, TRIGGER ON UPDATE, or a SQL task in a job.
You don't need sinks, multi-stage orchestration, or other pipeline-only features.
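
To illustrate the refresh options listed above, here is a minimal Python sketch that places a `SCHEDULE` or `TRIGGER ON UPDATE` clause in a standalone materialized view definition. The helper function and the object names are hypothetical. The clause position (between the view name and `AS`) and the `EVERY 1 HOUR` form reflect one reading of the Databricks SQL syntax and should be verified against the `CREATE MATERIALIZED VIEW` reference.

```python
from typing import Optional


def scheduled_mv_sql(name: str, query: str, refresh_clause: Optional[str] = None) -> str:
    """Build a CREATE MATERIALIZED VIEW statement with an optional refresh clause."""
    parts = [f"CREATE MATERIALIZED VIEW {name}"]
    if refresh_clause:
        # Assumption: the refresh clause goes between the view name and AS.
        parts.append(refresh_clause)
    parts.append(f"AS {query}")
    return "\n".join(parts)


# Hypothetical names, for illustration only.
NAME = "main.sales.orders_by_region"
QUERY = "SELECT region, COUNT(*) AS orders FROM main.sales.orders GROUP BY region"

# Refresh on a fixed interval.
hourly = scheduled_mv_sql(NAME, QUERY, "SCHEDULE EVERY 1 HOUR")

# Refresh when upstream data changes.
on_update = scheduled_mv_sql(NAME, QUERY, "TRIGGER ON UPDATE")

# No clause: refreshes are left to a SQL task in a job that runs REFRESH.
unscheduled = scheduled_mv_sql(NAME, QUERY)

print(hourly)
```

Only one of the three approaches is needed for a given dataset; the variants are shown side by side for comparison.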

When to use a Lakeflow pipeline

Use a Lakeflow pipeline when:

You build a multi-stage pipeline with intermediate datasets, where Databricks manages dependencies and lineage across the datasets. Intermediate datasets can be published to the catalog or kept private to the pipeline.
You author tables and flows in Python.
You write to external Delta tables or event streaming destinations using sinks (create_sink() or foreach_batch_sink()).
You apply change data capture from a database snapshot using create_auto_cdc_from_snapshot_flow().
You want triggered or continuous execution across the whole pipeline.

Comparison

Property	Standalone streaming table or materialized view	Pipeline streaming table or materialized view
Authoring interface	SQL syntax, from a Databricks SQL warehouse or with `spark.sql()` in a notebook on serverless general compute	SQL and Python
Scope	One dataset, in a pipeline that Databricks manages for you	Many datasets in one pipeline, with dependency orchestration and lineage
Execution	Triggered, with `SCHEDULE`, `TRIGGER ON UPDATE`, or a SQL task	Triggered or continuous
Pipeline-only features	None	Sinks, `create_auto_cdc_from_snapshot_flow()`, private datasets
Pipeline type label	`MV/ST`	`ETL`
Move between pipelines	Not supported; recreate the table in the target pipeline	Supported
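
The comparison above can be condensed into a small decision helper. This is a simplified sketch rather than official guidance: the `Requirements` fields and the `recommend()` function are hypothetical names that mirror the criteria listed in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    """Hypothetical checklist of what a workload needs."""
    multi_stage: bool = False              # intermediate datasets with managed dependencies
    python_authoring: bool = False         # tables and flows authored in Python
    sinks: bool = False                    # external Delta tables or event streaming destinations
    cdc_from_snapshot: bool = False        # create_auto_cdc_from_snapshot_flow()
    private_datasets: bool = False         # datasets kept private to the pipeline
    continuous_execution: bool = False     # continuous rather than triggered execution


def recommend(req: Requirements) -> str:
    """Return 'Lakeflow pipeline' if any pipeline-only need is present."""
    needs_pipeline = any(
        (
            req.multi_stage,
            req.python_authoring,
            req.sinks,
            req.cdc_from_snapshot,
            req.private_datasets,
            req.continuous_execution,
        )
    )
    return "Lakeflow pipeline" if needs_pipeline else "standalone"


# A single SQL-defined dataset with triggered refreshes.
print(recommend(Requirements()))

# A workload that writes to an external destination through a sink.
print(recommend(Requirements(sinks=True)))
```

The helper treats every pipeline-only requirement as decisive, which matches the table: a standalone dataset supports none of them.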
