Cluster libraries

Cluster libraries can be used by all notebooks running on a cluster. This article describes how to use the Install library UI in the Databricks workspace.

Note

If you create compute using a policy that enforces library installations, you can’t install or uninstall libraries on your compute. Workspace admins control all library management at the policy level.

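You can install libraries to a cluster using the following approaches: install a library directly on the cluster through the Install library UI, or have libraries installed automatically through a cluster policy.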

When you install a library on a cluster, a notebook already attached to that cluster will not immediately see the new library. You must first detach and then reattach the notebook to the cluster.
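For Python libraries, one way to check whether an attached notebook can already see a newly installed library is to ask the interpreter whether it can locate the module. The following is a minimal sketch, not a Databricks API: `library_visible` is a hypothetical helper name, and it assumes you know the importable top-level module name, which can differ from the package name.

```python
import importlib.util


def library_visible(module_name: str) -> bool:
    """Return True if the current Python process can locate a top-level module.

    Hypothetical helper for illustration. Pass a top-level module name
    (for example "json"), not a PyPI package name or a dotted submodule path.
    """
    return importlib.util.find_spec(module_name) is not None


# "json" ships with Python, so it is always locatable.
print(library_visible("json"))
```

If the check returns False for a library you just installed, detaching and reattaching the notebook as described above is the step to try.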

Not all cluster access modes support all library configurations. See Compute compatibility with libraries and init scripts.

Install a library on a cluster

To install a library on a cluster:

  1. Click Compute in the sidebar.

  2. Click a cluster name.

  3. Click the Libraries tab.

  4. Click Install New.

  5. The Install library dialog displays.

  6. Select one of the Library Source options, complete the instructions that appear, and then click Install.
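The same installation can also be scripted. The sketch below assumes the Databricks Libraries REST API (`POST /api/2.0/libraries/install`) and only builds the JSON request body; it does not send anything. The cluster ID and package name are made-up placeholders, and `build_install_payload` is a hypothetical helper name.

```python
import json


def build_install_payload(cluster_id: str, libraries: list) -> dict:
    """Build a request body for POST /api/2.0/libraries/install (assumed endpoint).

    Each entry in `libraries` is a one-key dict such as
    {"pypi": {"package": "..."}} or {"whl": "/Volumes/path/to/library.whl"}.
    """
    if not cluster_id:
        raise ValueError("cluster_id is required")
    if not libraries:
        raise ValueError("at least one library is required")
    return {"cluster_id": cluster_id, "libraries": list(libraries)}


# Placeholder values for illustration only.
payload = build_install_payload(
    "0000-000000-example0",
    [{"pypi": {"package": "simplejson"}}],
)
print(json.dumps(payload, indent=2))
```

Sending the body requires an authenticated HTTP client pointed at your workspace URL, which is left out here on purpose.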

Important

Default behavior for the library upload UI has changed. Legacy behavior always stored libraries in the DBFS root. All workspace users have the ability to modify data and files stored in the DBFS root.

The default location for library uploads is now workspace files. Databricks recommends uploading libraries to workspace files or Unity Catalog volumes, or using library package repositories. If your workload does not support these patterns, you can also use libraries stored in cloud object storage.

| Library source | Instructions |
| --- | --- |
| Workspace | Select a workspace file or upload a Whl, zipped wheelhouse, JAR, ZIP, tar, or requirements.txt file. |
| Volumes | Select a Whl or JAR file from a volume. |
| File Path/S3 | Select the library type and provide the full URI to the library object (for example: /Workspace/path/to/library.whl, /Volumes/path/to/library.whl, or s3://bucket-name/path/to/library.whl). |
| PyPI | Enter a PyPI package name. See PyPI package. |
| Maven | Specify a Maven coordinate. See Maven or Spark package. |
| CRAN | Enter the name of a package. See CRAN package. |
| DBFS (Not recommended) | Load a JAR or Whl file to the DBFS root. This is not recommended, as files stored in DBFS can be modified by any workspace user. |
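When you use the File Path/S3 option, the URI prefix indicates where the library is stored. The sketch below is an illustration only: `classify_library_uri` is a hypothetical helper, the `/Workspace/`, `/Volumes/`, and `s3://` prefixes come from the examples above, and the `dbfs:/` prefix for DBFS paths is an assumption added here.

```python
def classify_library_uri(uri: str) -> str:
    """Map a library URI to the storage location suggested by its prefix.

    Hypothetical helper. The "dbfs:/" prefix is an assumption; the other
    prefixes mirror the File Path/S3 examples in this article.
    """
    prefixes = (
        ("/Workspace/", "workspace files"),
        ("/Volumes/", "Unity Catalog volume"),
        ("s3://", "cloud object storage"),
        ("dbfs:/", "DBFS (not recommended)"),
    )
    for prefix, location in prefixes:
        if uri.startswith(prefix):
            return location
    return "unknown"


for example in (
    "/Workspace/path/to/library.whl",
    "/Volumes/path/to/library.whl",
    "s3://bucket-name/path/to/library.whl",
):
    print(example, "->", classify_library_uri(example))
```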

The identity used when a library is installed depends on the cluster's access mode. In single user access mode, the identity of the assigned principal (a user or service principal) is used. In shared access mode, libraries use the identity of the user who installed the library.

Note

No-isolation shared access mode does not support volumes, but uses the same identity assignment as shared access mode.

Install a library using a policy

If you create a cluster using a policy that enforces library installation, specified libraries automatically install on your cluster. You cannot install additional libraries or uninstall any libraries.

Workspace admins can add libraries to policies, allowing them to manage and enforce library installations on all compute that uses the policy. For admin instructions, see Add libraries to a policy.

Uninstall a library from a cluster

Note

When you uninstall a library from a cluster, the library is removed only when you restart the cluster. Until you restart the cluster, the status of the uninstalled library appears as Uninstall pending restart.

To uninstall a library, you can start from either the cluster or the library:

Cluster

  1. Click Compute in the sidebar.

  2. Click a cluster name.

  3. Click the Libraries tab.

  4. Select the checkbox next to the library you want to uninstall, click Uninstall, then Confirm. The Status changes to Uninstall pending restart.

Library

  1. Go to the folder containing the library.

  2. Click the library name.

  3. Select the checkbox next to the cluster you want to uninstall the library from, click Uninstall, then Confirm. The Status changes to Uninstall pending restart.

  4. Click the cluster name to go to the cluster detail page.

To complete the uninstall, click Restart, then Confirm on the cluster detail page. After the restart, the library is removed from the cluster’s Libraries tab.
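The two-phase nature of an uninstall (mark the library for removal, then restart the cluster) can be sketched as an ordered list of REST calls. This assumes the Databricks Libraries API (`POST /api/2.0/libraries/uninstall`) and Clusters API (`POST /api/2.0/clusters/restart`); `uninstall_plan` is a hypothetical helper, the cluster ID and package are placeholders, and no request is actually sent.

```python
def uninstall_plan(cluster_id: str, libraries: list) -> list:
    """Return the (endpoint, body) pairs needed to fully remove libraries.

    Step 1 marks the libraries for removal; they stay loaded until
    step 2 restarts the cluster. Endpoints are assumed, not verified here.
    """
    if not cluster_id:
        raise ValueError("cluster_id is required")
    if not libraries:
        raise ValueError("at least one library is required")
    return [
        ("/api/2.0/libraries/uninstall",
         {"cluster_id": cluster_id, "libraries": list(libraries)}),
        ("/api/2.0/clusters/restart",
         {"cluster_id": cluster_id}),
    ]


# Placeholder values for illustration only.
for endpoint, body in uninstall_plan(
    "0000-000000-example0", [{"pypi": {"package": "simplejson"}}]
):
    print("POST", endpoint, body)
```

Keeping the restart as a separate, explicit step mirrors the UI, where the library stays in a pending state until the cluster is restarted.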

View the libraries installed on a cluster

  1. Click Compute in the sidebar.

  2. Click the cluster name.

  3. Click the Libraries tab. For each library, the tab displays the name and version, type, install status, and, if uploaded, the source file.
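A similar overview can be produced programmatically. The sketch below assumes the response shape of the Databricks Libraries API status call (`GET /api/2.0/libraries/cluster-status`), with a `library_statuses` list whose entries carry a `library` spec and a `status` string. `summarize_library_statuses` is a hypothetical helper and the sample response is hand-written for illustration, not real output.

```python
def summarize_library_statuses(response: dict) -> list:
    """Flatten a cluster-status style response into (type, name, status) rows.

    Assumes each entry looks like
    {"library": {"pypi": {"package": "..."}}, "status": "INSTALLED"}.
    """
    rows = []
    for entry in response.get("library_statuses", []):
        status = entry.get("status", "UNKNOWN")
        for kind, spec in entry.get("library", {}).items():
            if isinstance(spec, dict):
                # Package repositories nest the name under a key.
                name = spec.get("package") or spec.get("coordinates") or str(spec)
            else:
                # File-based libraries (whl, jar) are given as a plain path.
                name = str(spec)
            rows.append((kind, name, status))
    return rows


# Hand-written sample, not real API output.
sample_response = {
    "cluster_id": "0000-000000-example0",
    "library_statuses": [
        {"library": {"pypi": {"package": "simplejson"}}, "status": "INSTALLED"},
        {"library": {"whl": "/Volumes/path/to/library.whl"},
         "status": "UNINSTALL_ON_RESTART"},
    ],
}

for kind, name, status in summarize_library_statuses(sample_response):
    print(f"{kind:6} {name:35} {status}")
```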

Update a cluster-installed library

To update a cluster-installed library, uninstall the old version of the library and install a new version.