Libraries API 2.0

The Libraries API allows you to install and uninstall libraries and get the status of libraries on a cluster.

Important

To access Databricks REST APIs, you must authenticate.

All cluster statuses

| Endpoint | HTTP Method |
| --- | --- |
| 2.0/libraries/all-cluster-statuses | GET |

Get the status of all libraries on all clusters. A status is available for every library installed on a cluster through the API or the libraries UI, as well as for libraries set to be installed on all clusters through the libraries UI. If a library has been set to be installed on all clusters, is_library_for_all_clusters is true, even if the library was also installed on a specific cluster.

Example

Request

curl --netrc --request GET \
https://<databricks-instance>/api/2.0/libraries/all-cluster-statuses \
| jq .

Replace <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.

This example uses a .netrc file and jq.

Response

{
  "statuses": [
    {
      "cluster_id": "11203-my-cluster",
      "library_statuses": [
        {
          "library": {
            "jar": "dbfs:/mnt/libraries/library.jar"
          },
          "status": "INSTALLING",
          "messages": [],
          "is_library_for_all_clusters": false
        }
      ]
    },
    {
      "cluster_id": "20131-my-other-cluster",
      "library_statuses": [
        {
          "library": {
            "egg": "dbfs:/mnt/libraries/library.egg"
          },
          "status": "FAILED",
          "messages": ["Could not download library"],
          "is_library_for_all_clusters": false
        }
      ]
    }
  ]
}

Response structure

| Field Name | Type | Description |
| --- | --- | --- |
| statuses | An array of ClusterLibraryStatuses | A list of cluster statuses. |
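
As a minimal sketch of how a client might consume this response, the following Python function collects the libraries whose status is FAILED (the failure state listed under LibraryInstallStatus), grouped by cluster. The function name failed_libraries and the sample dictionary are illustrative and not part of the API; the sample only mimics the shape of the response described above.

```python
from collections import defaultdict


def failed_libraries(response):
    """Map cluster_id -> list of (library, messages) for libraries that FAILED."""
    failures = defaultdict(list)
    for cluster in response.get("statuses", []):
        for entry in cluster.get("library_statuses", []):
            if entry.get("status") == "FAILED":
                failures[cluster["cluster_id"]].append(
                    (entry["library"], entry.get("messages", []))
                )
    return dict(failures)


# Hypothetical parsed response, shaped like the all-cluster-statuses output.
sample = {
    "statuses": [
        {
            "cluster_id": "11203-my-cluster",
            "library_statuses": [
                {
                    "library": {"jar": "dbfs:/mnt/libraries/library.jar"},
                    "status": "INSTALLING",
                    "messages": [],
                    "is_library_for_all_clusters": False,
                }
            ],
        },
        {
            "cluster_id": "20131-my-other-cluster",
            "library_statuses": [
                {
                    "library": {"egg": "dbfs:/mnt/libraries/library.egg"},
                    "status": "FAILED",
                    "messages": ["Could not download library"],
                    "is_library_for_all_clusters": False,
                }
            ],
        },
    ]
}

print(failed_libraries(sample))
```

Clusters with no failed libraries are simply absent from the result, so an empty dictionary means nothing has failed.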

Cluster status

| Endpoint | HTTP Method |
| --- | --- |
| 2.0/libraries/cluster-status | GET |

Get the status of libraries on a cluster. A status is available for every library installed on the cluster through the API or the libraries UI, as well as for libraries set to be installed on all clusters through the libraries UI. If a library has been set to be installed on all clusters, is_library_for_all_clusters is true, even if the library was also installed on the cluster.

Example

Request

curl --netrc --request GET \
'https://<databricks-instance>/api/2.0/libraries/cluster-status?cluster_id=<cluster-id>' \
| jq .

Or:

curl --netrc --get \
https://<databricks-instance>/api/2.0/libraries/cluster-status \
--data cluster_id=<cluster-id> \
| jq .

Replace:

  • <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.

  • <cluster-id> with the Databricks workspace ID of the cluster, for example 1234-567890-example123.

This example uses a .netrc file and jq.

Response

{
  "cluster_id": "11203-my-cluster",
  "library_statuses": [
    {
      "library": {
        "jar": "dbfs:/mnt/libraries/library.jar"
      },
      "status": "INSTALLED",
      "messages": [],
      "is_library_for_all_clusters": false
    },
    {
      "library": {
        "pypi": {
          "package": "beautifulsoup4"
        }
      },
      "status": "INSTALLING",
      "messages": ["Successfully resolved package from PyPI"],
      "is_library_for_all_clusters": false
    },
    {
      "library": {
        "cran": {
          "package": "ada",
          "repo": "https://cran.us.r-project.org"
        }
      },
      "status": "FAILED",
      "messages": ["R package installation is not supported on this spark version.\nPlease upgrade to Runtime 3.2 or higher"],
      "is_library_for_all_clusters": false
    }
  ]
}

Request structure

| Field Name | Type | Description |
| --- | --- | --- |
| cluster_id | STRING | Unique identifier of the cluster whose status should be retrieved. This field is required. |
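
Because cluster_id travels as a query parameter on a GET request, a client has to URL-encode it. The sketch below builds the same URL as the first curl form, using only the Python standard library. The function name cluster_status_url is hypothetical, and the instance name and cluster ID are the placeholder values from the text above.

```python
from urllib.parse import urlencode


def cluster_status_url(instance, cluster_id):
    """Build the GET URL for 2.0/libraries/cluster-status."""
    if not cluster_id:
        raise ValueError("cluster_id is required")
    # urlencode escapes any characters that are not safe in a query string.
    query = urlencode({"cluster_id": cluster_id})
    return f"https://{instance}/api/2.0/libraries/cluster-status?{query}"


url = cluster_status_url(
    "dbc-a1b2345c-d6e7.cloud.databricks.com", "1234-567890-example123"
)
print(url)
```

Authentication is left out on purpose; the curl examples delegate it to a .netrc file, and a Python client would need an equivalent.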

Response structure

| Field Name | Type | Description |
| --- | --- | --- |
| cluster_id | STRING | Unique identifier for the cluster. |
| library_statuses | An array of LibraryFullStatus | Status of all libraries on the cluster. |

Install

| Endpoint | HTTP Method |
| --- | --- |
| 2.0/libraries/install | POST |

Install libraries on a cluster. The installation is asynchronous: it completes in the background after the request returns.

Important

This call will fail if the cluster is terminated.

Installing a wheel library on a cluster is like running the pip command against the wheel file directly on the driver and executors. All dependencies specified in the library's setup.py file are installed, which requires the library name to satisfy the wheel file name convention.

The installation on the executors happens only when a new task is launched. With Databricks Runtime 7.1 and below, the installation order of libraries is nondeterministic. For wheel libraries, you can ensure a deterministic installation order by creating a zip file with suffix .wheelhouse.zip that includes all the wheel files.
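
One way to produce such an archive is sketched below with Python's standard zipfile module. The function name build_wheelhouse is hypothetical. Placing the wheels at the root of the archive is an assumption, since the text above only specifies the .wheelhouse.zip suffix and that the archive contains the wheel files.

```python
import zipfile
from pathlib import Path


def build_wheelhouse(wheel_paths, name, out_dir="."):
    """Bundle wheel files into <name>.wheelhouse.zip and return the archive path."""
    archive = Path(out_dir) / f"{name}.wheelhouse.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        for wheel in wheel_paths:
            wheel = Path(wheel)
            if wheel.suffix != ".whl":
                raise ValueError(f"not a wheel file: {wheel}")
            # Assumption: wheels sit at the archive root, without directories.
            zf.write(wheel, arcname=wheel.name)
    return archive
```

The resulting file would then be uploaded to a location the cluster can read and referenced through a whl entry, as in the install example.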

Example

curl --netrc --request POST \
https://<databricks-instance>/api/2.0/libraries/install \
--data @install-libraries.json

install-libraries.json:

{
  "cluster_id": "10201-my-cluster",
  "libraries": [
    {
      "jar": "dbfs:/mnt/libraries/library.jar"
    },
    {
      "egg": "dbfs:/mnt/libraries/library.egg"
    },
    {
      "whl": "dbfs:/mnt/libraries/mlflow-0.0.1.dev0-py2-none-any.whl"
    },
    {
      "whl": "dbfs:/mnt/libraries/wheel-libraries.wheelhouse.zip"
    },
    {
      "maven": {
        "coordinates": "org.jsoup:jsoup:1.7.2",
        "exclusions": ["slf4j:slf4j"]
      }
    },
    {
      "pypi": {
        "package": "simplejson",
        "repo": "https://my-pypi-mirror.com"
      }
    },
    {
      "cran": {
        "package": "ada",
        "repo": "https://cran.us.r-project.org"
      }
    }
  ]
}

Replace:

  • <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.

  • The contents of install-libraries.json with fields that are appropriate for your solution.

This example uses a .netrc file.

Request structure

| Field Name | Type | Description |
| --- | --- | --- |
| cluster_id | STRING | Unique identifier for the cluster on which to install these libraries. This field is required. |
| libraries | An array of Library | The libraries to install. |
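
The request body can also be generated rather than written by hand. The following sketch assembles a body with the two fields described above and serializes it to JSON. The helper name build_install_payload is hypothetical; the cluster ID and library entries are taken from the example above.

```python
import json


def build_install_payload(cluster_id, libraries):
    """Return the JSON body for POST 2.0/libraries/install."""
    if not cluster_id:
        raise ValueError("cluster_id is required")
    body = {"cluster_id": cluster_id, "libraries": list(libraries)}
    return json.dumps(body, indent=2)


payload = build_install_payload(
    "10201-my-cluster",
    [
        {"whl": "dbfs:/mnt/libraries/wheel-libraries.wheelhouse.zip"},
        {"maven": {"coordinates": "org.jsoup:jsoup:1.7.2", "exclusions": ["slf4j:slf4j"]}},
        {"pypi": {"package": "simplejson"}},
    ],
)
print(payload)
```

Writing the printed text to install-libraries.json makes it usable with the curl command shown in the example.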

Uninstall

| Endpoint | HTTP Method |
| --- | --- |
| 2.0/libraries/uninstall | POST |

Set libraries to be uninstalled on a cluster. The libraries aren’t uninstalled until the cluster is restarted. Uninstalling libraries that are not installed on the cluster has no effect and is not an error.

Example

curl --netrc --request POST \
https://<databricks-instance>/api/2.0/libraries/uninstall \
--data @uninstall-libraries.json

uninstall-libraries.json:

{
  "cluster_id": "10201-my-cluster",
  "libraries": [
    {
      "jar": "dbfs:/mnt/libraries/library.jar"
    },
    {
      "cran": {
        "package": "ada"
      }
    }
  ]
}

Replace:

  • <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.

  • The contents of uninstall-libraries.json with fields that are appropriate for your solution.

This example uses a .netrc file.

Request structure

| Field Name | Type | Description |
| --- | --- | --- |
| cluster_id | STRING | Unique identifier for the cluster on which to uninstall these libraries. This field is required. |
| libraries | An array of Library | The libraries to uninstall. |
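
A common pattern is to derive the uninstall request from a cluster-status response: keep a list of wanted libraries and remove everything else. The sketch below does this in Python. The function name uninstall_request and the sample data are hypothetical, and libraries are compared by exact dictionary equality, which is an assumption about how a client would want to match them.

```python
def uninstall_request(cluster_status, keep):
    """Build an uninstall body for every reported library not listed in `keep`."""
    keep = list(keep)
    remove = [
        entry["library"]
        for entry in cluster_status.get("library_statuses", [])
        if entry["library"] not in keep
        # Already marked for removal; nothing more to request.
        and entry.get("status") != "UNINSTALL_ON_RESTART"
    ]
    return {"cluster_id": cluster_status["cluster_id"], "libraries": remove}


# Hypothetical parsed cluster-status response.
status = {
    "cluster_id": "11203-my-cluster",
    "library_statuses": [
        {"library": {"jar": "dbfs:/mnt/libraries/library.jar"}, "status": "INSTALLED"},
        {"library": {"pypi": {"package": "beautifulsoup4"}}, "status": "INSTALLED"},
        {"library": {"egg": "dbfs:/mnt/libraries/library.egg"}, "status": "UNINSTALL_ON_RESTART"},
    ],
}

request = uninstall_request(status, keep=[{"jar": "dbfs:/mnt/libraries/library.jar"}])
print(request)
```

Because the removal only takes effect after a restart, the cluster still reports these libraries until then.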

Data structures

ClusterLibraryStatuses

| Field Name | Type | Description |
| --- | --- | --- |
| cluster_id | STRING | Unique identifier for the cluster. |
| library_statuses | An array of LibraryFullStatus | Status of all libraries on the cluster. |

Library

Specify one of the following fields.

| Field Name | Type | Description |
| --- | --- | --- |
| jar | STRING | URI of the JAR to be installed. DBFS and S3 URIs are supported. For example: { "jar": "dbfs:/mnt/databricks/library.jar" } or { "jar": "s3://my-bucket/library.jar" }. If S3 is used, make sure the cluster has read access on the library. You may need to launch the cluster with an instance profile to access the S3 URI. |
| egg | STRING | URI of the egg to be installed. DBFS and S3 URIs are supported. For example: { "egg": "dbfs:/my/egg" } or { "egg": "s3://my-bucket/egg" }. If S3 is used, make sure the cluster has read access on the library. You may need to launch the cluster with an instance profile to access the S3 URI. |
| whl | STRING | URI of the wheel or zipped wheels to be installed. DBFS and S3 URIs are supported. For example: { "whl": "dbfs:/my/whl" } or { "whl": "s3://my-bucket/whl" }. If S3 is used, make sure the cluster has read access on the library. You may need to launch the cluster with an instance profile to access the S3 URI. The wheel file name must follow the wheel file name convention. If zipped wheels are to be installed, the file name suffix must be .wheelhouse.zip. |
| pypi | PythonPyPiLibrary | Specification of a PyPI library to be installed. The repo field is optional; if it is not specified, the default pip index is used. For example: { "pypi": { "package": "simplejson", "repo": "https://my-repo.com" } } |
| maven | MavenLibrary | Specification of a Maven library to be installed. For example: { "maven": { "coordinates": "org.jsoup:jsoup:1.7.2" } } |
| cran | RCranLibrary | Specification of a CRAN library to be installed. For example: { "cran": { "package": "ada" } } |
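
The shape rules above can be checked on the client before a request is sent. The following Python sketch is one possible validator, not something the API provides. The name validate_library is hypothetical, and treating a Library as an object with exactly one field is an interpretation of the "OR" in the field list above. The required sub-fields come from the PythonPyPiLibrary, MavenLibrary, and RCranLibrary structures.

```python
# Fields whose value is a URI string.
STRING_KINDS = {"jar", "egg", "whl"}
# Fields whose value is an object, mapped to that object's required field.
OBJECT_KINDS = {"pypi": "package", "maven": "coordinates", "cran": "package"}


def validate_library(library):
    """Raise ValueError unless `library` has one known field of the right shape."""
    if not isinstance(library, dict) or len(library) != 1:
        raise ValueError("a Library must be an object with exactly one field")
    kind, value = next(iter(library.items()))
    if kind in STRING_KINDS:
        if not isinstance(value, str):
            raise ValueError(f"{kind} must be a URI string")
        if kind == "whl" and value.endswith(".zip") and not value.endswith(".wheelhouse.zip"):
            raise ValueError("zipped wheels must use the .wheelhouse.zip suffix")
    elif kind in OBJECT_KINDS:
        required = OBJECT_KINDS[kind]
        if not isinstance(value, dict) or not value.get(required):
            raise ValueError(f"{kind} requires the '{required}' field")
    else:
        raise ValueError(f"unknown library type: {kind}")
    return library
```

For instance, validate_library({"cran": {"package": "ada"}}) returns its argument, while a bare string under cran raises ValueError because RCranLibrary is an object type.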

LibraryFullStatus

The status of the library on a specific cluster.

| Field Name | Type | Description |
| --- | --- | --- |
| library | Library | Unique identifier for the library. |
| status | LibraryInstallStatus | Status of installing the library on the cluster. |
| messages | An array of STRING | All the info and warning messages that have occurred so far for this library. |
| is_library_for_all_clusters | BOOL | Whether the library was set to be installed on all clusters via the libraries UI. |

MavenLibrary

| Field Name | Type | Description |
| --- | --- | --- |
| coordinates | STRING | Gradle-style Maven coordinates. For example: org.jsoup:jsoup:1.7.2. This field is required. |
| repo | STRING | Maven repo to install the Maven package from. If omitted, both Maven Central Repository and Spark Packages are searched. |
| exclusions | An array of STRING | List of dependencies to exclude. For example: ["slf4j:slf4j", "*:hadoop-client"]. Maven dependency exclusions: https://maven.apache.org/guides/introduction/introduction-to-optional-and-excludes-dependencies.html. |

PythonPyPiLibrary

| Field Name | Type | Description |
| --- | --- | --- |
| package | STRING | The name of the PyPI package to install. An optional exact version specification is also supported. Examples: simplejson and simplejson==3.8.0. This field is required. |
| repo | STRING | The repository where the package can be found. If not specified, the default pip index is used. |
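
To illustrate the two accepted forms of the package field, the sketch below splits a specification into a name and an optional exact version. The helper name split_pypi_spec is hypothetical, and the check is deliberately narrow: it only recognizes the bare name and name==version forms mentioned above.

```python
def split_pypi_spec(spec):
    """Split 'name' or 'name==version' into (name, version or None)."""
    name, sep, version = spec.partition("==")
    name, version = name.strip(), version.strip()
    # Reject an empty name, or a '==' that is not followed by a version.
    if not name or (sep and not version):
        raise ValueError(f"invalid package specification: {spec!r}")
    return name, (version or None)


print(split_pypi_spec("simplejson"))
print(split_pypi_spec("simplejson==3.8.0"))
```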

RCranLibrary

| Field Name | Type | Description |
| --- | --- | --- |
| package | STRING | The name of the CRAN package to install. This field is required. |
| repo | STRING | The repository where the package can be found. If not specified, the default CRAN repo is used. |

LibraryInstallStatus

The status of a library on a specific cluster.

| Status | Description |
| --- | --- |
| PENDING | No action has yet been taken to install the library. This state should be very short-lived. |
| RESOLVING | Metadata necessary to install the library is being retrieved from the provided repository. For Jar, Egg, and Whl libraries, this step is a no-op. |
| INSTALLING | The library is actively being installed, either by adding resources to Spark or executing system commands inside the Spark nodes. |
| INSTALLED | The library has been successfully installed. |
| SKIPPED | Installation on a Databricks Runtime 7.0 or above cluster was skipped due to Scala version incompatibility. |
| FAILED | Some step in installation failed. More information can be found in the messages field. |
| UNINSTALL_ON_RESTART | The library has been marked for removal. Libraries can be removed only when clusters are restarted, so libraries that enter this state will remain until the cluster is restarted. |
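
Since installation runs in the background, a client usually polls the cluster status until no library is still in progress. The sketch below shows one way to do that in Python. The name wait_until_settled is hypothetical, and fetch_status is a caller-supplied function that returns the parsed cluster-status response. Treating INSTALLED, SKIPPED, FAILED, and UNINSTALL_ON_RESTART as the states where polling can stop is an interpretation of the descriptions above, not something the API states explicitly.

```python
import time

# Assumption: no further installation progress is expected in these states.
SETTLED = {"INSTALLED", "SKIPPED", "FAILED", "UNINSTALL_ON_RESTART"}


def wait_until_settled(fetch_status, timeout_s=600.0, interval_s=10.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until every library is settled or the timeout expires.

    fetch_status: callable returning the parsed cluster-status response (a dict).
    Returns the final list of library statuses.
    """
    deadline = clock() + timeout_s
    while True:
        statuses = fetch_status().get("library_statuses", [])
        if all(entry["status"] in SETTLED for entry in statuses):
            return statuses
        if clock() >= deadline:
            raise TimeoutError("libraries did not settle before the timeout")
        sleep(interval_s)
```

The sleep and clock parameters exist so the loop can be exercised without real delays. Callers should still inspect the returned statuses, because a settled library may have FAILED.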