Libraries

To make third-party or custom code available to notebooks and jobs running on your clusters, you can install a library. Libraries can be written in Python, Java, Scala, and R. You can upload Java, Scala, and Python libraries and point to external packages in PyPI, Maven, and CRAN repositories.

This article focuses on performing library tasks in the workspace UI. You can also manage libraries using the Libraries CLI or the Libraries API.
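As a sketch of the CLI workflow, the legacy Databricks CLI provides a `libraries` command group; the cluster ID below is a hypothetical placeholder:

```shell
# Install a PyPI package on a running cluster (cluster ID is a placeholder)
databricks libraries install --cluster-id 1234-567890-abcde123 --pypi-package simplejson

# Check the installation status of libraries on that cluster
databricks libraries cluster-status --cluster-id 1234-567890-abcde123
```

These commands require the CLI to be configured with credentials for your workspace.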

Tip

Databricks installs many common libraries by default. To see which libraries are installed by default, look at the System Environment subsection of the Databricks Runtime release notes for your Databricks Runtime version.

Libraries can be installed in one of three modes: workspace, cluster, and notebook-scoped.

  • Workspace libraries serve as a local repository from which you create cluster-installed libraries. A workspace library might be custom code created by your organization, or might be a particular version of an open-source library that your organization has standardized on.

  • Cluster libraries can be used by all notebooks running on a cluster. You can install a cluster library directly from a public repository such as PyPI or Maven, or create one from a previously installed workspace library.

  • Notebook-scoped Python libraries allow you to install Python libraries and create an environment scoped to a notebook session. Notebook-scoped libraries do not affect other notebooks running on the same cluster. These libraries do not persist and must be re-installed for each session.

    Use notebook-scoped libraries when you need a custom Python environment for a specific notebook. With notebook-scoped libraries, you can also save, reuse, and share Python environments.

    • Notebook-scoped libraries are available via %pip and %conda magic commands in Databricks Runtime ML 6.4 and above, and via %pip magic commands in Databricks Runtime 7.1 and above. See Notebook-scoped Python libraries.
    • Notebook-scoped libraries are available via library utilities in all Databricks Runtime versions. See Library utilities.
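For example, a notebook-scoped install via library utilities might look like the following sketch; it runs only inside a Databricks notebook (where `dbutils` is predefined), and the package name and version are arbitrary examples:

```python
# Install a specific version of a PyPI package, scoped to this notebook session only.
dbutils.library.installPyPI("beautifulsoup4", version="4.11.1")

# Restart the Python process so the newly installed library becomes importable.
dbutils.library.restartPython()
```

In Databricks Runtime 7.1 and above, the equivalent `%pip install beautifulsoup4==4.11.1` magic command in its own cell achieves the same notebook-scoped effect.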

This section covers: