Delta Live Tables supports external dependencies in your pipelines. Databricks recommends using one of two patterns to install Python packages:

- Use the %pip install command to install packages for all source files in a pipeline.
- Import modules or libraries from source code stored in workspace files. See Import Python modules from workspace files.
Delta Live Tables also supports using global and cluster-scoped init scripts. However, these external dependencies, particularly init scripts, increase the risk of issues with runtime upgrades. To mitigate these risks, minimize the use of init scripts in your pipelines. If your processing requires init scripts, automate testing of your pipeline to detect problems early, and increase your testing frequency.
To specify external Python libraries, use the %pip install magic command. When an update starts, Delta Live Tables runs all cells containing a %pip install command before running any table definitions. Every Python notebook included in the pipeline has access to all installed libraries. The following example installs the numpy library and makes it globally available to any Python notebook in the pipeline:

```
%pip install numpy
import numpy as np
```
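Once a package is installed this way, any source file in the pipeline can use it in its transformation logic. As a minimal sketch (the normalize helper and its values are hypothetical examples, not part of the Delta Live Tables API), a pipeline function might rely on the installed numpy package like this:

```python
import numpy as np

def normalize(values):
    """Scale a sequence of numeric values to the [0, 1] range using numpy."""
    arr = np.asarray(values, dtype=float)
    span = arr.max() - arr.min()
    if span == 0:
        # All values are identical: map everything to 0.0.
        return np.zeros_like(arr)
    return (arr - arr.min()) / span
```

Because Delta Live Tables runs every %pip install cell before any table definitions, imports like this succeed in every notebook in the pipeline.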
To install a Python wheel package, add the wheel path to the %pip install command. Installed Python wheel packages are available to all tables in the pipeline. The following example installs a wheel named dltfns-1.0-py3-none-any.whl from the DBFS directory /dbfs/dlt/:

```
%pip install /dbfs/dlt/dltfns-1.0-py3-none-any.whl
```