# Pipeline developer reference
This section contains reference material and instructions for pipeline developers.
Data loading and transformations are implemented in pipelines by queries that define streaming tables and materialized views. To implement these queries, Lakeflow Spark Declarative Pipelines supports SQL and Python interfaces. Because these interfaces provide equivalent functionality for most data processing use cases, pipeline developers can choose the interface that they are most comfortable with.
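For example, the following is a minimal sketch of pipeline source code in SQL that defines one of each object type; the table names and the volume path are hypothetical placeholders:

```sql
-- Streaming table: ingests new files incrementally as they arrive.
CREATE OR REFRESH STREAMING TABLE raw_orders
AS SELECT *
FROM STREAM read_files(
  '/Volumes/main/default/landing/orders',  -- hypothetical volume path
  format => 'json'
);

-- Materialized view: kept up to date by the pipeline from its source tables.
CREATE OR REFRESH MATERIALIZED VIEW daily_order_counts
AS SELECT order_date, count(*) AS order_count
FROM raw_orders
GROUP BY order_date;
```

An equivalent pipeline can be written in Python, as shown in the next section.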
## Python development
Create pipelines using Python code.
| Topic | Description |
|---|---|
| Develop pipelines with Python | An overview of developing pipelines in Python. |
| Lakeflow Spark Declarative Pipelines Python language reference | Python reference documentation for the Lakeflow Spark Declarative Pipelines Python interface. |
| Manage Python libraries | Instructions for managing Python libraries in pipelines. |
| Use stored Python modules | Instructions for using Python modules that you have stored in Databricks. |
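As a companion to the topics above, here is a minimal Python sketch, assuming the classic `dlt` module (the module name and helpers may differ by release); the table names and volume path are hypothetical placeholders:

```python
import dlt

@dlt.table(comment="Orders ingested incrementally with Auto Loader.")
def raw_orders():
    # `spark` is provided by the pipeline runtime; the path is a placeholder.
    return (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/main/default/landing/orders")
    )

@dlt.table(comment="Orders that pass a basic data quality check.")
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")
def clean_orders():
    # Read the upstream streaming table defined in this pipeline.
    return dlt.read_stream("raw_orders")
```

Each decorated function returns a DataFrame that defines its table's contents, and expectations such as `expect_or_drop` enforce row-level quality rules.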
## SQL development
Create pipelines using SQL code.
| Topic | Description |
|---|---|
| Develop pipelines with SQL | An overview of developing pipelines in SQL. |
| SQL language reference | Reference documentation for SQL syntax for Lakeflow Spark Declarative Pipelines. |
| Pipelines in Databricks SQL | Use Databricks SQL to work with pipelines. |
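To illustrate the SQL syntax, the following sketch defines a streaming table with a data quality expectation; the names are hypothetical, and rows that fail the constraint are dropped rather than failing the update:

```sql
CREATE OR REFRESH STREAMING TABLE clean_orders (
  -- Drop any row whose order_id is NULL.
  CONSTRAINT valid_order_id EXPECT (order_id IS NOT NULL) ON VIOLATION DROP ROW
)
AS SELECT * FROM STREAM(raw_orders);
```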
## Other development topics
The following topics describe other ways to develop pipelines.
| Topic | Description |
|---|---|
| Convert a pipeline to a bundle | Convert an existing pipeline to a bundle, which allows you to manage your data processing configuration in a source-controlled YAML file for easier maintenance and automated deployments to target environments (see the configuration sketch after this table). |
| | Use the open source |
| Local development | An overview of options for developing pipelines locally. |
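For the bundle conversion topic above, here is a minimal sketch of what a pipeline's bundle configuration (`databricks.yml`) might look like; all names, paths, and the workspace URL are hypothetical placeholders:

```yaml
bundle:
  name: orders_pipeline

resources:
  pipelines:
    orders_pipeline:
      name: orders_pipeline
      catalog: main             # hypothetical Unity Catalog catalog
      target: sales             # hypothetical destination schema
      libraries:
        - file:
            path: ./transformations/orders.sql

targets:
  dev:
    default: true
  prod:
    workspace:
      host: https://example-workspace.cloud.databricks.com
```

With a configuration along these lines, `databricks bundle validate` checks the file and `databricks bundle deploy -t dev` deploys the pipeline to the `dev` target.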