Databricks Asset Bundle library dependencies

Preview

This feature is in Public Preview.

This article describes the syntax for declaring Databricks Asset Bundle library dependencies. Bundles enable programmatic management of Databricks workflows. See What are Databricks Asset Bundles?.

Your Databricks jobs, Delta Live Tables pipelines, and MLOps Stacks likely depend on additional libraries to work as expected. Use the following information to declare these library dependencies in your bundle configuration files. See Databricks Asset Bundle configurations.

Bundles support the following library dependency types for Databricks jobs:

  • Python wheel

  • PyPI package

  • Maven package

Bundles support only Maven library dependencies for Delta Live Tables pipelines.

The following sections provide examples that show how to declare these library dependencies.

Job: Python wheel

Databricks workspace filesystem, Amazon S3, and local filesystem URIs are supported for Python wheels. If S3 is used, the cluster must have read access to the Python wheel. You might need to launch the cluster with an AWS IAM role to access the S3 URI.

Important

Do not store Python wheels in the Databricks File System (DBFS), especially not in the DBFS root. All workspace users can modify data and files stored in the DBFS root. To avoid this, upload Python wheels to workspace files or volumes, or store them in cloud object storage.

The following example shows how to install two Python wheel files.

  • The first Python wheel file is in the same local folder as this bundle configuration file.

  • The second Python wheel is in the specified workspace filesystem path in the Databricks workspace.

resources:
  jobs:
    my_job:
      # ...
      tasks:
        - task_key: my_task
          # ...
          libraries:
            - whl: "./my-wheel-0.1.0.whl"
            - whl: "/Workspace/Shared/Libraries/my-wheel-0.0.1-py3-none-any.whl"

Job: PyPI package

In your job task definition, in libraries, specify a pypi mapping for each PyPI package to be installed. For each mapping, specify the following:

  • For package, specify the name of the PyPI package to install. You can optionally include an exact version specification, for example numpy==1.25.2.

  • Optionally, for repo, specify the repository where the PyPI package can be found. If not specified, the default pip index is used (https://pypi.org/simple/).

The following example shows how to install two PyPI packages.

  • The first PyPI package uses the specified package version and the default pip index.

  • The second PyPI package uses the specified package version and the explicitly specified pip index.

resources:
  jobs:
    my_job:
      # ...
      tasks:
        - task_key: my_task
          # ...
          libraries:
            - pypi:
                package: "wheel==0.41.2"
            - pypi:
                package: "numpy==1.25.2"
                repo: "https://pypi.org/simple/"

Job: Maven package

In your job task definition, in libraries, specify a maven mapping for each Maven package to be installed. For each mapping, specify the following:

  • For coordinates, specify the Gradle-style Maven coordinates for the package.

  • Optionally, for repo, specify the Maven repo to install the Maven package from. If omitted, both the Maven Central Repository and the Spark Packages Repository are searched.

  • Optionally, for exclusions, specify any dependencies to explicitly exclude. See Maven dependency exclusions.

The following example shows how to install two Maven packages.

  • The first Maven package uses the specified package coordinates and searches for this package in both the Maven Central Repository and the Spark Packages Repository.

  • The second Maven package uses the specified package coordinates, searches for this package only in the specified Maven repository, and does not include any of this package’s dependencies that match the specified pattern.

resources:
  jobs:
    my_job:
      # ...
      tasks:
        - task_key: my_task
          # ...
          libraries:
            - maven:
                coordinates: "com.databricks:databricks-sdk-java:0.8.1"
            - maven:
                coordinates: "com.databricks:databricks-dbutils-scala_2.13:0.1.4"
                repo: "https://mvnrepository.com/"
                exclusions:
                  - "org.scala-lang:scala-library:2.13.0-RC*"

Pipeline: Maven package

In your pipeline definition, in libraries, specify a maven mapping for each Maven package to be installed. For each mapping, specify the following:

  • For coordinates, specify the Gradle-style Maven coordinates for the package.

  • Optionally, for repo, specify the Maven repo to install the Maven package from. If omitted, both the Maven Central Repository and the Spark Packages Repository are searched.

  • Optionally, for exclusions, specify any dependencies to explicitly exclude. See Maven dependency exclusions.

The following example shows how to install two Maven packages.

  • The first Maven package uses the specified package coordinates and searches for this package in both the Maven Central Repository and the Spark Packages Repository.

  • The second Maven package uses the specified package coordinates, searches for this package only in the specified Maven repository, and does not include any of this package’s dependencies that match the specified pattern.

resources:
  pipelines:
    my_pipeline:
      # ...
      libraries:
        - maven:
            coordinates: "com.databricks:databricks-sdk-java:0.8.1"
        - maven:
            coordinates: "com.databricks:databricks-dbutils-scala_2.13:0.1.4"
            repo: "https://mvnrepository.com/"
            exclusions:
              - "org.scala-lang:scala-library:2.13.0-RC*"