Manage configuration of Delta Live Tables pipelines
Because Delta Live Tables automates operational complexities such as infrastructure management, task orchestration, error recovery, and performance optimization, many of your pipelines can run with minimal manual configuration. However, Delta Live Tables also lets you manage the configuration of pipelines that require non-default settings, or when you want to optimize performance and resource usage. These articles provide details on managing configurations for your Delta Live Tables pipelines, including settings that determine how pipelines run, options for the compute that runs a pipeline, and management of external dependencies such as Python libraries.
Use serverless compute to run fully managed pipelines
Use serverless DLT pipelines to run your pipelines on reliable, fully managed compute resources. With serverless compute, the compute that runs your pipeline is automatically optimized and scaled up and down based on the resources required to run the pipeline. Serverless DLT pipelines also support additional features that improve performance, such as incremental refresh for materialized views, faster startup times for compute resources, and improved processing of streaming workloads. See Create fully managed pipelines using Delta Live Tables with serverless compute.
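For illustration, here is a minimal sketch of a pipeline's JSON settings with serverless compute enabled; the pipeline name and notebook path are hypothetical placeholders:

```json
{
  "name": "my-serverless-pipeline",
  "serverless": true,
  "continuous": false,
  "libraries": [
    { "notebook": { "path": "/Users/you@example.com/dlt/ingest" } }
  ]
}
```

Because Databricks manages the compute, a serverless pipeline's settings omit the clusters section used when you configure compute yourself.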
Manage pipeline settings
The configuration for a Delta Live Tables pipeline includes settings that define the source code implementing the pipeline. It also includes settings that control pipeline infrastructure, dependency management, how updates are processed, and how tables are saved in the workspace. Most configurations are optional, but some require careful attention.
To learn about the configuration options for pipelines and how to use them, see Configure pipeline settings for Delta Live Tables.
For detailed specifications of Delta Live Tables settings, properties that control how tables are managed, and non-settable compute options, see Delta Live Tables properties reference.
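As a hedged sketch (not an exhaustive reference), a pipeline's JSON settings combine source code locations with infrastructure, storage, and update options; every name and path below is a hypothetical placeholder:

```json
{
  "name": "example-pipeline",
  "edition": "ADVANCED",
  "development": true,
  "continuous": false,
  "libraries": [
    { "notebook": { "path": "/Users/you@example.com/dlt/bronze" } }
  ],
  "catalog": "main",
  "target": "example_schema",
  "configuration": {
    "example.source_path": "/Volumes/main/landing/raw"
  }
}
```

Here, libraries points to the pipeline's source code, catalog and target control where tables are saved, and configuration passes arbitrary key-value pairs that your pipeline code can read.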
Manage external dependencies for pipelines that use Python
Delta Live Tables supports using external dependencies in your pipelines such as Python packages and libraries. To learn about options and recommendations for using dependencies, see Manage Python dependencies for Delta Live Tables pipelines.
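For example, a common pattern for installing a Python package is a %pip command at the top of a pipeline notebook; the package below is an arbitrary example:

```python
# At the top of a pipeline notebook, before any other code:
# install an external package into the pipeline's environment.
# "requests" is an arbitrary example package.
%pip install requests
```

Note that %pip is a Databricks notebook magic command rather than standard Python syntax, so it must appear in a notebook cell, not in an imported module.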
Use Python modules stored in your Databricks workspace
In addition to implementing your Python code in Databricks notebooks, you can use Databricks Git Folders or workspace files to store your code as Python modules. Storing your code as Python modules is especially useful when you have common functionality you want to reuse across multiple pipelines, or across multiple notebooks in the same pipeline, as sketched below. To learn how to use Python modules with your pipelines, see Import Python modules from Git folders or workspace files.
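A minimal sketch of that pattern, assuming a hypothetical shared module transforms.py stored in a Git folder; the path, module, and function names are placeholders:

```python
import os
import sys

# Make the directory containing the shared module importable.
# This path is a hypothetical Git folder location.
sys.path.append(os.path.abspath("/Workspace/Repos/you@example.com/shared"))

# Import a helper defined in /Workspace/Repos/you@example.com/shared/transforms.py.
from transforms import add_ingest_metadata  # hypothetical module and function
```

The imported function can then be called from your pipeline's dataset definitions like any other Python code.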
Optimize pipeline compute utilization
Use Enhanced Autoscaling to optimize the cluster utilization of your pipelines. Enhanced Autoscaling adds resources only if the system determines those resources will increase pipeline processing speed. Resources are freed when no longer needed, and clusters are shut down as soon as all pipeline updates are complete.
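As a minimal sketch, Enhanced Autoscaling is selected per cluster in the pipeline's JSON settings through the autoscale mode; the worker counts below are arbitrary examples:

```json
{
  "clusters": [
    {
      "label": "default",
      "autoscale": {
        "min_workers": 1,
        "max_workers": 5,
        "mode": "ENHANCED"
      }
    }
  ]
}
```

Setting "mode" to "ENHANCED" opts the cluster into Enhanced Autoscaling instead of legacy autoscaling.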
To learn more about Enhanced Autoscaling, including configuration details, see Optimize the cluster utilization of Delta Live Tables pipelines with Enhanced Autoscaling.