Cluster libraries
Cluster libraries can be used by all notebooks running on a cluster. This article describes how to install, uninstall, and view cluster libraries using the Install library UI in the Databricks workspace.
You can install libraries to a cluster using the following approaches:
Install a library for use with a specific cluster only.
Install a workspace library that has previously been defined in the workspace. See Workspace libraries.
Install a library using an init script that runs at cluster creation time. See Install a library with an init script.
Install a library with the REST API. See the Libraries API.
Install a library with Databricks CLI. See What is the Databricks CLI?.
Install a library using Terraform. See Databricks Terraform provider and databricks_library.
When you install a library on a cluster, a notebook already attached to that cluster will not immediately see the new library. You must first detach and then reattach the notebook to the cluster.
Not all cluster access modes support all library configurations. See Compute compatibility with libraries and init scripts.
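As a rough illustration of the REST API approach listed above, the following sketch builds the JSON body for the Libraries API install endpoint (`POST /api/2.0/libraries/install`). The cluster ID, the helper name `build_install_payload`, and the environment variables `DATABRICKS_HOST` and `DATABRICKS_TOKEN` are assumptions made for the example. The helper does no JSON escaping, so it only suits simple values.

```shell
# Sketch: build the request body for POST /api/2.0/libraries/install.
# build_install_payload is a hypothetical helper; the cluster ID is a placeholder.
build_install_payload() {
  # $1: cluster ID, $2: PyPI package name (no JSON escaping is performed)
  printf '{"cluster_id": "%s", "libraries": [{"pypi": {"package": "%s"}}]}' \
    "$1" "$2"
}

payload="$(build_install_payload "1234-567890-abcde123" "astropy")"
echo "$payload"

# To send the request (assumes DATABRICKS_HOST and DATABRICKS_TOKEN are set):
# curl -sS -X POST "$DATABRICKS_HOST/api/2.0/libraries/install" \
#   -H "Authorization: Bearer $DATABRICKS_TOKEN" \
#   -H "Content-Type: application/json" \
#   -d "$payload"
```

The curl call is left commented out so the snippet can be run without a workspace. As with UI installs, notebooks that are already attached must be detached and reattached before they see the library.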
Install a library on a cluster
To install a library on a cluster:
Click Compute in the sidebar.
Click a cluster name.
Click the Libraries tab.
Click Install New.
The Install library dialog displays.
Select one of the Library Source options, complete the instructions that appear, and then click Install.
Important
Libraries uploaded using the library UI are stored in the DBFS root. All workspace users can modify data and files stored in the DBFS root. You can avoid this by uploading libraries to workspace files or Unity Catalog volumes, using libraries in cloud object storage, or using library package repositories.
Library Source | Instructions |
---|---|
Upload | Load a JAR or Whl file to the DBFS root and install it on the cluster. See Upload a Jar, Python egg, or Python wheel. |
File Path/S3 | Provide the full workspace path, volume path, or URI to the library object (for example: |
PyPI | Enter a PyPI package name. See PyPI package. |
Maven | Specify a Maven coordinate. See Maven or Spark package. |
CRAN | Enter the name of a package. See CRAN package. |
Workspace Library | Select a workspace library. See Workspace libraries. |
In assigned access mode, the identity of the assigned principal (a user or service principal) is used.
In shared access mode, libraries use the identity of the user who installed the library.
Note
No-isolation shared access mode does not support volumes, but uses the same identity assignment as shared access mode.
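For readers who script installs instead of using the dialog, most library sources in the table above have a counterpart in the library specification objects accepted by the Libraries API. The sketch below maps a source type to such an object. The function name `library_spec` and all sample values (package names, Maven coordinate, volume path) are hypothetical. Workspace libraries are left out because this sketch does not assume an equivalent specification for them.

```shell
# Sketch: map a library source type to a Libraries API library specification.
# library_spec is a hypothetical helper; it performs no JSON escaping.
library_spec() {
  # $1: source type (pypi, maven, cran, whl, jar)
  # $2: package name, Maven coordinate, or file path/URI
  case "$1" in
    pypi)    printf '{"pypi": {"package": "%s"}}' "$2" ;;
    maven)   printf '{"maven": {"coordinates": "%s"}}' "$2" ;;
    cran)    printf '{"cran": {"package": "%s"}}' "$2" ;;
    whl|jar) printf '{"%s": "%s"}' "$1" "$2" ;;
    *)       echo "unsupported library source: $1" >&2; return 1 ;;
  esac
}

# Sample values below are placeholders, not paths from this article.
library_spec pypi "astropy"; echo
library_spec maven "org.example:example-lib:1.0.0"; echo
library_spec whl "/Volumes/main/default/libs/example-0.1-py3-none-any.whl"; echo
```

Each printed object can be placed in the `libraries` array of an install request.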
Install a library with an init script
If your library requires custom configuration, you may not be able to install it using the workspace or cluster library interface. Instead, you can install the library using an init script.
Here is an example of an init script that uses pip to install Python libraries on a Databricks Runtime cluster at cluster initialization.
#!/bin/bash
# Install astropy with the cluster's own pip so notebooks on the cluster can import it.
/databricks/python/bin/pip install astropy
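A slightly more defensive variant of the init script shown above might pin the package version and stop on the first error. The sketch below writes such a script to a local file and checks its syntax without running it, because `/databricks/python/bin/pip` exists only on a cluster. The output file name and the version pin are illustrative assumptions.

```shell
# Sketch: generate a pinned, fail-fast init script locally.
# The file name and the astropy version pin are placeholders.
init_script="${TMPDIR:-/tmp}/install-libs.sh"

cat > "$init_script" <<'EOF'
#!/bin/bash
# Exit at the first failing command so a failed install is not silently ignored.
set -euo pipefail

# Use the cluster's own pip so the package lands in the notebook Python environment.
/databricks/python/bin/pip install "astropy==5.3.4"
EOF

# Check the generated script for syntax errors without executing it.
bash -n "$init_script" && echo "wrote $init_script"
```

Pinning a version keeps cluster restarts reproducible. Whether a failing init script should block cluster startup is a choice to make for your environment.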
Uninstall a library from a cluster
Note
When you uninstall a library from a cluster, the library is removed only when you restart the cluster. Until you restart the cluster, the status of the uninstalled library appears as Uninstall pending restart.
To uninstall a library, you can start from a cluster or a library:
Cluster
Click Compute in the sidebar.
Click a cluster name.
Click the Libraries tab.
Select the checkbox next to the library you want to uninstall, click Uninstall, then Confirm. The Status changes to Uninstall pending restart.
Library
Go to the folder containing the library.
Click the library name.
Select the checkbox next to the cluster you want to uninstall the library from, click Uninstall, then Confirm. The Status changes to Uninstall pending restart.
Click the cluster name to go to the cluster detail page.
Click Restart and Confirm to uninstall the library. The library is removed from the cluster’s Libraries tab.
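The same two-step flow (mark the library for removal, then restart the cluster) can be sketched against the REST API. The example below builds request bodies for `POST /api/2.0/libraries/uninstall` and `POST /api/2.0/clusters/restart`. The helper names, the cluster ID, and the environment variables `DATABRICKS_HOST` and `DATABRICKS_TOKEN` are assumptions made for the example.

```shell
# Sketch: request bodies for uninstalling a PyPI library and restarting the cluster.
# Helper names and the cluster ID are hypothetical; no JSON escaping is performed.
build_uninstall_payload() {
  # $1: cluster ID, $2: PyPI package name
  printf '{"cluster_id": "%s", "libraries": [{"pypi": {"package": "%s"}}]}' \
    "$1" "$2"
}

build_restart_payload() {
  # $1: cluster ID
  printf '{"cluster_id": "%s"}' "$1"
}

cluster_id="1234-567890-abcde123"
build_uninstall_payload "$cluster_id" "astropy"; echo
build_restart_payload "$cluster_id"; echo

# To send the requests (assumes DATABRICKS_HOST and DATABRICKS_TOKEN are set):
# curl -sS -X POST "$DATABRICKS_HOST/api/2.0/libraries/uninstall" \
#   -H "Authorization: Bearer $DATABRICKS_TOKEN" \
#   -d "$(build_uninstall_payload "$cluster_id" "astropy")"
# curl -sS -X POST "$DATABRICKS_HOST/api/2.0/clusters/restart" \
#   -H "Authorization: Bearer $DATABRICKS_TOKEN" \
#   -d "$(build_restart_payload "$cluster_id")"
```

As in the UI, the uninstall request only marks the library for removal. The library remains on the cluster until the restart completes.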
View the libraries installed on a cluster
Click Compute in the sidebar.
Click the cluster name.
Click the Libraries tab. For each library, the tab displays the name and version, type, install status, and, if uploaded, the source file.
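The information on the Libraries tab can also be retrieved programmatically. The sketch below builds the URL for the Libraries API `cluster-status` endpoint and, if `jq` is installed, prints the status and specification of each library from a response. The helper name `status_url`, the workspace URL, the cluster ID, and the sample response are placeholders for illustration, not output captured from a real workspace.

```shell
# Sketch: query library status for one cluster.
# status_url is a hypothetical helper; the host and cluster ID are placeholders.
status_url() {
  # $1: workspace URL (a trailing slash is removed), $2: cluster ID
  printf '%s/api/2.0/libraries/cluster-status?cluster_id=%s' "${1%/}" "$2"
}

url="$(status_url "https://example.cloud.databricks.com/" "1234-567890-abcde123")"
echo "$url"

# Hand-written sample shaped like a cluster-status response (not real output).
sample='{"cluster_id": "1234-567890-abcde123", "library_statuses": [{"library": {"pypi": {"package": "astropy"}}, "status": "INSTALLED"}]}'

# Print one line per library: install status, then the library specification.
if command -v jq >/dev/null 2>&1; then
  printf '%s' "$sample" | jq -r '.library_statuses[] | "\(.status)\t\(.library | tostring)"'
fi

# Against a real workspace (assumes DATABRICKS_TOKEN is set):
# curl -sS -H "Authorization: Bearer $DATABRICKS_TOKEN" "$url"
```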