Compute compatibility with libraries and init scripts

This article provides an overview of compatibility between Databricks Runtime versions, access modes, locations for storing init scripts and libraries, and types of libraries.

All init scripts mentioned in this article are cluster-scoped init scripts. All libraries mentioned in this article are installed as cluster-scoped libraries.

Python and R libraries can optionally be installed at the notebook level. See Notebook-scoped Python libraries or Notebook-scoped R libraries.

Important

Init script support for referencing other files depends on where the referenced files are stored. See What files can I reference in an init script?

Best practices

Databricks recommends the following best practices:

  • Manage init scripts using compute policies rather than global init scripts.

  • Use shared access mode for all workloads. Use single user access mode only if required functionality is not supported by shared access mode.

  • Use recent Databricks Runtime versions for all workloads.

  • Manage library installation for production and interactive environments using compute policies. Don’t install libraries using init scripts.

In single user access mode, libraries and init scripts use the identity of the assigned principal (a user or service principal).

In shared access mode:

  • Libraries use the identity of the library installer.

  • Init scripts use the identity of the cluster owner.

Note

No-isolation shared access mode does not support volumes, but uses the same identity assignment as shared access mode.
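The identity rules above can be sketched as a small lookup function. This is a minimal illustration only: the mode names, the artifact names, and the `effective_identity` function are hypothetical labels chosen for this sketch and are not part of any Databricks API.

```python
def effective_identity(access_mode, artifact, *, assigned_principal,
                       library_installer, cluster_owner):
    """Return which identity applies, following the rules stated in this article.

    access_mode: "single_user", "shared", or "no_isolation_shared" (hypothetical labels)
    artifact:    "library" or "init_script" (hypothetical labels)
    """
    if artifact not in ("library", "init_script"):
        raise ValueError(f"unknown artifact type: {artifact!r}")

    if access_mode == "single_user":
        # Single user access mode: the assigned user or service principal.
        return assigned_principal

    if access_mode in ("shared", "no_isolation_shared"):
        # No-isolation shared uses the same identity assignment as shared.
        if artifact == "library":
            return library_installer
        return cluster_owner

    raise ValueError(f"unknown access mode: {access_mode!r}")


# Example: on shared compute, an init script runs as the cluster owner.
who = effective_identity(
    "shared", "init_script",
    assigned_principal="alice", library_installer="bob", cluster_owner="carol",
)
print(who)
```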

Where can libraries be stored?

You can install libraries on compute from the following locations:

  • Workspace files

  • Unity Catalog volumes

  • Cloud object storage

  • Package repositories

Not all locations are supported for all types of libraries or all compute configurations.

Where can init scripts be stored?

You can store and configure init scripts from the following locations:

  • Workspace files

  • Unity Catalog volumes

  • Cloud object storage

Not all locations are supported for all compute configurations.
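When auditing existing configurations, it can help to sort each library or init script path into one of the storage locations above. The sketch below assumes that workspace files appear under a `/Workspace/` prefix, that Unity Catalog volumes appear under `/Volumes/`, and that cloud object storage is referenced by a URI scheme such as `s3://`, `abfss://`, or `gs://`. These prefixes and the `classify_location` helper are assumptions made for illustration, so adjust them to match your environment. Libraries from package repositories are referenced by package name rather than by path and are out of scope here.

```python
# Assumed URI schemes for cloud object storage; extend as needed.
CLOUD_SCHEMES = ("s3://", "abfss://", "gs://")


def classify_location(path: str) -> str:
    """Map a library or init script path to a storage location category."""
    if path.startswith("/Workspace/"):
        return "workspace files"
    if path.startswith("/Volumes/"):
        return "volumes"
    if path.startswith(CLOUD_SCHEMES):  # str.startswith accepts a tuple
        return "cloud storage"
    return "unknown"


# Hypothetical example paths.
for p in [
    "/Workspace/Shared/init/setup.sh",
    "/Volumes/main/default/libs/mylib-0.1-py3-none-any.whl",
    "s3://my-bucket/jars/mylib.jar",
]:
    print(p, "->", classify_location(p))
```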

Shared access mode support for libraries and init scripts

Databricks recommends using shared access mode for all workloads. When scheduling workflows with shared access mode, Databricks recommends running the workflow with a service principal. See Identity best practices.

The following table indicates the compatibility for libraries and init scripts with shared access mode. The Databricks Runtime version listed is the minimum version required to use the pattern.

Note

Shared access mode requires an admin to add Maven coordinates, JARs, and init scripts to an allowlist. See Allowlist libraries and init scripts on shared compute.

Python libraries can be installed from package repositories and workspace files on compute configured with shared access mode and Databricks Runtime 13.1 and above, and are not subject to the allowlist. See Install libraries from a package repository.

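| Install from    | Python libraries (wheels) | Scala libraries (JARs) | Init scripts  |
|-----------------|---------------------------|------------------------|---------------|
| Workspace files | 13.1                      | Not supported          | Not supported |
| Volumes         | 13.2                      | 13.3 LTS               | 13.3 LTS      |
| Cloud storage   | 13.1                      | 13.3 LTS               | 13.3 LTS      |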

Single user access mode support for libraries and init scripts

Databricks recommends using single user access mode only if required functionality is not supported by shared access mode.

The following table indicates the compatibility for libraries and init scripts with single user access mode. The Databricks Runtime version listed is the minimum version required to use the pattern.

Note

Single user access mode allows installation of all supported library types from package repositories. See Install libraries from a package repository.

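| Install from    | Python libraries (wheels)                 | Scala libraries (JARs)                    | Init scripts                              |
|-----------------|-------------------------------------------|-------------------------------------------|-------------------------------------------|
| Workspace files | 13.2                                      | Not supported                             | All supported Databricks Runtime versions |
| Volumes         | 13.2                                      | 13.3 LTS                                  | 13.3 LTS                                  |
| Cloud storage   | All supported Databricks Runtime versions | All supported Databricks Runtime versions | All supported Databricks Runtime versions |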

No-isolation shared access mode support for libraries and init scripts

No-isolation shared access mode is a legacy configuration on Databricks that does not support Unity Catalog. Databricks recommends updating all compute to either shared or single user access mode.

The following table indicates the compatibility for libraries and init scripts with no-isolation shared access mode. The Databricks Runtime version listed is the minimum version required to use the pattern.

Note

No-isolation shared access mode allows installation of all supported library types from package repositories. See Install libraries from a package repository.

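| Install from    | Python libraries                          | Scala libraries and JARs                  | Init scripts                              |
|-----------------|-------------------------------------------|-------------------------------------------|-------------------------------------------|
| Workspace files | 14.1                                      | 14.1                                      | All supported Databricks Runtime versions |
| Volumes         | Not supported                             | Not supported                             | Not supported                             |
| Cloud storage   | All supported Databricks Runtime versions | All supported Databricks Runtime versions | All supported Databricks Runtime versions |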
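The three compatibility tables in this article can be encoded as a lookup structure for automated checks, for example in a script that validates compute configurations before deployment. The following is a minimal sketch, not an official tool: the dictionary keys and the `parse_runtime` and `is_supported` helpers are hypothetical names, and the values are transcribed from the tables in this article. "All supported Databricks Runtime versions" is represented as a minimum of `(0, 0)`, so the sketch does not check whether a runtime version is itself still supported.

```python
ANY = (0, 0)   # "All supported Databricks Runtime versions"
NO = None      # "Not supported"

# MIN_RUNTIME[access_mode][location][artifact] -> minimum (major, minor) or None
MIN_RUNTIME = {
    "shared": {
        "workspace files": {"python": (13, 1), "jar": NO, "init_script": NO},
        "volumes": {"python": (13, 2), "jar": (13, 3), "init_script": (13, 3)},
        "cloud storage": {"python": (13, 1), "jar": (13, 3), "init_script": (13, 3)},
    },
    "single_user": {
        "workspace files": {"python": (13, 2), "jar": NO, "init_script": ANY},
        "volumes": {"python": (13, 2), "jar": (13, 3), "init_script": (13, 3)},
        "cloud storage": {"python": ANY, "jar": ANY, "init_script": ANY},
    },
    "no_isolation_shared": {
        "workspace files": {"python": (14, 1), "jar": (14, 1), "init_script": ANY},
        "volumes": {"python": NO, "jar": NO, "init_script": NO},
        "cloud storage": {"python": ANY, "jar": ANY, "init_script": ANY},
    },
}


def parse_runtime(version: str) -> tuple:
    """Parse a version label such as "13.3 LTS" or "14.1" into (major, minor)."""
    major, minor = version.split()[0].split(".")[:2]
    return int(major), int(minor)


def is_supported(access_mode: str, location: str, artifact: str, runtime: str) -> bool:
    """Return True if the runtime meets the minimum version for this combination."""
    minimum = MIN_RUNTIME[access_mode][location][artifact]
    if minimum is None:
        return False
    return parse_runtime(runtime) >= minimum


# Example: JARs from volumes on shared compute need 13.3 LTS or above.
print(is_supported("shared", "volumes", "jar", "13.3 LTS"))
print(is_supported("shared", "volumes", "jar", "13.2"))
```

Keeping the matrix as plain data makes it straightforward to update when the minimum versions change, without touching the checking logic.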