Workspace libraries
Workspace libraries serve as a local repository from which you create cluster-installed libraries. A workspace library might be custom code created by your organization, or might be a particular version of an open-source library that your organization has standardized on.
You must install a workspace library on a cluster before it can be used in a notebook or job.
Workspace libraries in the Shared folder are available to all users in a workspace, while workspace libraries in a user folder are available only to that user.
Create a workspace library
Right-click the workspace folder where you want to store the library.
Select Create > Library.
The Create Library dialog displays.
Select the Library Source and follow the appropriate procedure:
Upload a Jar, Python Egg, or Python Wheel
- In the Library Source button list, select Upload.
- Select Jar, Python Egg, or Python Whl.
- Optionally enter a library name.
- Drag your Jar, Egg, or Whl to the drop box or click the drop box and navigate to a file. The file is uploaded to dbfs:/FileStore/jars.
- Click Create. The library status screen displays.
- Optionally install the library on a cluster.
Reference an uploaded Jar, Python Egg, or Python Wheel
If you’ve already uploaded a Jar, Egg, or Wheel to object storage, you can reference it in a workspace library.
You can choose a library in DBFS or one stored in S3.
- Select DBFS/S3 in the Library Source button list.
- Select Jar, Python Egg, or Python Whl.
- Optionally enter a library name.
- Specify the DBFS or S3 path to the library.
- Click Create. The library status screen displays.
- Optionally install the library on a cluster.
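Before entering a path in the dialog, it can help to check that it points at a supported storage location and file type. The following is a minimal sketch, not part of Databricks: the function name classify_library_path is hypothetical, and the accepted URI prefixes (dbfs:/, s3://, s3a://) and file extensions are assumptions based on the options listed above.

```python
from pathlib import PurePosixPath
from typing import Tuple

# Assumed mapping from file extension to the library type offered in the dialog.
LIBRARY_TYPES = {".jar": "Jar", ".egg": "Python Egg", ".whl": "Python Whl"}

# Assumed URI prefixes for the two storage options.
STORAGE_PREFIXES = {"dbfs:/": "DBFS", "s3://": "S3", "s3a://": "S3"}


def classify_library_path(path: str) -> Tuple[str, str]:
    """Return (storage, library_type) for a path, or raise ValueError."""
    storage = next(
        (name for prefix, name in STORAGE_PREFIXES.items() if path.startswith(prefix)),
        None,
    )
    if storage is None:
        raise ValueError(f"not a DBFS or S3 path: {path!r}")
    # The extension decides which library type to select in the dialog.
    suffix = PurePosixPath(path).suffix.lower()
    if suffix not in LIBRARY_TYPES:
        raise ValueError(f"unsupported library file type: {path!r}")
    return storage, LIBRARY_TYPES[suffix]
```

For example, classify_library_path("dbfs:/FileStore/jars/mylib.jar") reports a Jar stored in DBFS, while a local path such as /tmp/mylib.jar is rejected.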
PyPI package
- In the Library Source button list, select PyPI.
- Enter a PyPI package name. To install a specific version of a library, use the format <library>==<version>. For example, scikit-learn==0.19.1.
- In the Repository field, optionally enter a PyPI repository URL.
- Click Create. The library status screen displays.
- Optionally install the library on a cluster.
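The <library>==<version> format can be checked before you type it into the dialog. The sketch below is illustrative only: parse_pypi_spec is a hypothetical helper, and its pattern is a simplified assumption that covers a plain name with an optional pinned version, not the full range of specifiers that PyPI tools accept.

```python
import re
from typing import Optional, Tuple

# Simplified pattern: a package name, optionally followed by ==<version>.
# This is an assumed subset for illustration, not a complete specifier parser.
_SPEC = re.compile(
    r"^([A-Za-z0-9][A-Za-z0-9._-]*)(?:==([A-Za-z0-9][A-Za-z0-9.+!_-]*))?$"
)


def parse_pypi_spec(spec: str) -> Tuple[str, Optional[str]]:
    """Split '<library>==<version>' into (library, version).

    The version is None when the spec names a package without pinning it.
    """
    match = _SPEC.match(spec.strip())
    if match is None:
        raise ValueError(
            f"expected <library> or <library>==<version>, got {spec!r}"
        )
    return match.group(1), match.group(2)
```

With this helper, scikit-learn==0.19.1 splits into the name scikit-learn and the version 0.19.1, and a spec written with a single equals sign is rejected.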
Maven or Spark package
In the Library Source button list, select Maven.
Specify a Maven coordinate. Do one of the following:
- In the Coordinate field, enter the Maven coordinate of the library to install. Maven coordinates are in the form groupId:artifactId:version; for example, com.databricks:spark-avro_2.10:1.0.0.
- If you don’t know the exact coordinate, enter the library name and click Search Packages. A list of matching packages displays. To display details about a package, click its name. You can sort packages by name, organization, and rating. You can also filter the results by writing a query in the search bar. The results refresh automatically.
- Select Maven Central or Spark Packages in the drop-down list at the top left.
- Optionally select the package version in the Releases column.
- Click + Select next to a package. The Coordinate field is filled in with the selected package and version.
In the Repository field, optionally enter a Maven repository URL.
Note
Internal Maven repositories are not supported.
In the Exclusions field, optionally provide the groupId and the artifactId of the dependencies that you want to exclude; for example, log4j:log4j.
Click Create. The library status screen displays.
Optionally install the library on a cluster.
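The coordinate and exclusion formats described above can be validated with a short script before you fill in the dialog. This is a sketch under stated assumptions: MavenCoordinate, parse_coordinate, and parse_exclusion are hypothetical names, and the parser accepts only the three-part groupId:artifactId:version form and the two-part groupId:artifactId form shown in this section.

```python
from typing import NamedTuple, Tuple


class MavenCoordinate(NamedTuple):
    """The three parts of a coordinate in groupId:artifactId:version form."""

    group_id: str
    artifact_id: str
    version: str


def parse_coordinate(text: str) -> MavenCoordinate:
    """Parse 'groupId:artifactId:version' or raise ValueError."""
    parts = text.strip().split(":")
    # Exactly three non-empty parts are required in this simplified form.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected groupId:artifactId:version, got {text!r}")
    return MavenCoordinate(*parts)


def parse_exclusion(text: str) -> Tuple[str, str]:
    """Parse an exclusion given as 'groupId:artifactId' or raise ValueError."""
    parts = text.strip().split(":")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected groupId:artifactId, got {text!r}")
    return parts[0], parts[1]
```

For example, parse_coordinate("com.databricks:spark-avro_2.10:1.0.0") yields the group com.databricks, the artifact spark-avro_2.10, and the version 1.0.0, and parse_exclusion("log4j:log4j") yields the group and artifact to exclude.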
CRAN package
- In the Library Source button list, select CRAN.
- In the Package field, enter the name of the package.
- In the Repository field, optionally enter the CRAN repository URL.
- Click Create. The library detail screen displays.
- Optionally install the library on a cluster.
Note
CRAN mirrors serve the latest version of a library. As a result, you may end up with different versions of an R package if you attach the library to different clusters at different times. To learn how to manage and fix R package versions on Databricks, see the Knowledge Base.
View workspace library details
- Go to the workspace folder containing the library.
- Click the library name.
The library details page shows the running clusters and the install status of the library. If the library is installed, the page contains a link to the package host. If the library was uploaded, the page displays a link to the uploaded package file.
Move a workspace library
- Go to the workspace folder containing the library.
- Click the drop-down arrow to the right of the library name and select Move. A folder browser displays.
- Click the destination folder.
- Click Select.
- Click Confirm and Move.
Delete a workspace library
Important
Before deleting a workspace library, you should uninstall it from all clusters.
To delete a workspace library:
- Move the library to the Trash folder.
- Either permanently delete the library in the Trash folder or empty the Trash folder.