Build pipelines

Build Lakeflow pipelines by loading and transforming data, applying data quality checks, and writing results to your target tables. The following topics cover the tasks involved in building and running pipelines.

To learn the declarative concepts behind pipelines (datasets, flows, and the pipeline graph), see What are Lakeflow pipelines? For a step-by-step walkthrough, see Tutorial: Build an ETL pipeline using change data capture.

Topic	Description
Develop in the Lakeflow Pipelines Editor	Author, run, and debug pipelines in the editor, with a pipeline graph, data previews, and selective execution.
Use Genie Code for pipeline development	Generate, edit, and debug pipeline code from a single prompt with Genie Code Agent mode in the editor.
Manage identities and privileges	Control the identity that runs a pipeline and who can create, run, refresh, and view pipelines and their output.
Load data	Ingest data into your pipeline from cloud object storage and streaming message buses.
Transform data	Apply transformations, joins, and aggregations to build derived datasets.
Full refresh for streaming tables	Reprocess all source data to rebuild a streaming table.
Data quality	Validate records with expectations and control what happens when a record fails.
Write datasets	Write pipeline results to sinks such as Apache Kafka and Azure Event Hubs, and use flows to write to streaming targets.

Additional resources
