Workspace libraries

Workspace libraries serve as a local repository from which you create cluster-installed libraries. A workspace library might be custom code created by your organization, or might be a particular version of an open-source library that your organization has standardized on.

You must install a workspace library on a cluster before it can be used in a notebook or job.

Workspace libraries in the Shared folder are available to all users in a workspace, while workspace libraries in a user folder are available only to that user.

Create a workspace library

  1. Right-click the workspace folder where you want to store the library.

  2. Select Create > Library.

    The Create Library dialog displays.

  3. Select the Library Source and follow the appropriate procedure:

Upload a Jar, Python Egg, or Python Wheel

  1. In the Library Source button list, select Upload.
  2. Select Jar, Python Egg, or Python Whl.
  3. Optionally enter a library name.
  4. Drag your Jar, Egg, or Whl to the drop box or click the drop box and navigate to a file. The file is uploaded to dbfs:/FileStore/jars.
  5. Click Create. The library status screen displays.
  6. Optionally install the library on a cluster.

Reference an uploaded Jar, Python Egg, or Python Wheel

If you’ve already uploaded a Jar, Egg, or Wheel to object storage, you can reference it in a workspace library.

You can choose a library in DBFS or one stored in S3.

  1. Select DBFS/S3 in the Library Source button list.
  2. Select Jar, Python Egg, or Python Whl.
  3. Optionally enter a library name.
  4. Specify the DBFS or S3 path to the library.
  5. Click Create. The library status screen displays.
  6. Optionally install the library on a cluster.
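
Before entering a path in step 4, it can help to check that the path uses a supported storage scheme and that the file extension matches the library type chosen in step 2. The following is a minimal sketch, not part of the product: the function name `library_reference`, the one-key dictionary it returns, and the assumption that paths start with `dbfs:/` or `s3://` are all illustrative choices, and the example path is hypothetical.

```python
from urllib.parse import urlparse

# Extension -> library type, following the three types listed in step 2.
_LIBRARY_TYPES = {".jar": "jar", ".egg": "egg", ".whl": "whl"}


def library_reference(path: str) -> dict:
    """Return {"<type>": path} for a DBFS or S3 library path.

    Raises ValueError if the scheme or the file extension is not recognized.
    The accepted schemes are an assumption made for this sketch.
    """
    scheme = urlparse(path).scheme
    if scheme not in ("dbfs", "s3"):
        raise ValueError(f"expected a dbfs:/ or s3:// path, got {path!r}")
    for extension, library_type in _LIBRARY_TYPES.items():
        if path.lower().endswith(extension):
            return {library_type: path}
    raise ValueError(f"unsupported library file type: {path!r}")


if __name__ == "__main__":
    # Hypothetical file name under the upload directory mentioned earlier.
    print(library_reference("dbfs:/FileStore/jars/example_lib.whl"))
```

A path that fails either check raises `ValueError`, so a typo is caught before the library is created.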

PyPI package

  1. In the Library Source button list, select PyPI.
  2. Enter a PyPI package name. To install a specific version of a library, use the format <library>==<version>; for example, scikit-learn==0.19.1.
  3. In the Repository field, optionally enter a PyPI repository URL.
  4. Click Create. The library status screen displays.
  5. Optionally install the library on a cluster.
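
The `<library>==<version>` format from step 2 can be checked programmatically before it is typed into the dialog. The sketch below is illustrative only: `parse_pypi_requirement` is a hypothetical helper, and it handles just the exact-pin form described above, not the full range of Python version specifiers.

```python
def parse_pypi_requirement(requirement: str) -> dict:
    """Split '<library>==<version>' into a name and an optional pinned version.

    An unpinned name such as 'scikit-learn' yields version None.
    """
    name, separator, version = requirement.strip().partition("==")
    name = name.strip()
    version = version.strip()
    if not name:
        raise ValueError("package name is empty")
    if separator and not version:
        raise ValueError(f"missing version after '==' in {requirement!r}")
    return {"package": name, "version": version or None}


if __name__ == "__main__":
    print(parse_pypi_requirement("scikit-learn==0.19.1"))
    print(parse_pypi_requirement("scikit-learn"))
```

Pinning a version this way keeps the installed package the same across clusters, whereas an unpinned name resolves to whatever version the repository currently serves.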

Maven or Spark package

  1. In the Library Source button list, select Maven.

  2. Specify a Maven coordinate. Do one of the following:

    • In the Coordinate field, enter the Maven coordinate of the library to install. Maven coordinates are in the form groupId:artifactId:version; for example, com.databricks:spark-avro_2.10:1.0.0.
    • If you don’t know the exact coordinate, enter the library name and click Search Packages. A list of matching packages displays. To display details about a package, click its name. You can sort packages by name, organization, and rating. You can also filter the results by writing a query in the search bar. The results refresh automatically.
      1. Select Maven Central or Spark Packages in the drop-down list at the top left.
      2. Optionally select the package version in the Releases column.
      3. Click + Select next to a package. The Coordinate field is filled in with the selected package and version.
  3. In the Repository field, optionally enter a Maven repository URL.

    Note

    Internal Maven repositories are not supported.

  4. In the Exclusions field, optionally provide the groupId and the artifactId of the dependencies that you want to exclude; for example, log4j:log4j.

  5. Click Create. The library status screen displays.

  6. Optionally install the library on a cluster.
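
The coordinate and exclusion formats used in steps 2 and 4 can be validated with a few lines of code. This is a minimal sketch under stated assumptions: the names `MavenCoordinate`, `parse_coordinate`, and `parse_exclusion` are hypothetical, and the parser accepts only the three-part groupId:artifactId:version form described above, not longer coordinates that carry a packaging type or classifier.

```python
from typing import NamedTuple, Tuple


class MavenCoordinate(NamedTuple):
    group_id: str
    artifact_id: str
    version: str


def _split(value: str, expected_parts: int, label: str) -> list:
    """Split on ':' and require exactly `expected_parts` non-empty fields."""
    parts = [part.strip() for part in value.split(":")]
    if len(parts) != expected_parts or not all(parts):
        raise ValueError(f"invalid {label}: {value!r}")
    return parts


def parse_coordinate(coordinate: str) -> MavenCoordinate:
    """Parse 'groupId:artifactId:version'."""
    return MavenCoordinate(*_split(coordinate, 3, "Maven coordinate"))


def parse_exclusion(exclusion: str) -> Tuple[str, str]:
    """Parse 'groupId:artifactId', the form used in the Exclusions field."""
    group_id, artifact_id = _split(exclusion, 2, "exclusion")
    return group_id, artifact_id


if __name__ == "__main__":
    print(parse_coordinate("com.databricks:spark-avro_2.10:1.0.0"))
    print(parse_exclusion("log4j:log4j"))
```

Note that an exclusion has two fields while a coordinate has three, so passing a full coordinate to `parse_exclusion` raises `ValueError`.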

CRAN package

  1. In the Library Source button list, select CRAN.
  2. In the Package field, enter the name of the package.
  3. In the Repository field, optionally enter the CRAN repository URL.
  4. Click Create. The library detail screen displays.
  5. Optionally install the library on a cluster.

Note

CRAN mirrors serve only the latest version of a library. As a result, you may end up with different versions of an R package if you attach the library to different clusters at different times. To learn how to manage and pin R package versions on Databricks, see the Knowledge Base.

View workspace library details

  1. Go to the workspace folder containing the library.
  2. Click the library name.

The library details page shows the running clusters and the install status of the library. If the library is installed, the page contains a link to the package host. If the library was uploaded, the page displays a link to the uploaded package file.

Move a workspace library

  1. Go to the workspace folder containing the library.
  2. Click the drop-down arrow to the right of the library name and select Move. A folder browser displays.
  3. Click the destination folder.
  4. Click Select.
  5. Click Confirm and Move.

Delete a workspace library

Important

Before deleting a workspace library, you should uninstall it from all clusters.

To delete a workspace library:

  1. Move the library to the Trash folder.
  2. Either permanently delete the library in the Trash folder or empty the Trash folder.