Library API

The Libraries API allows you to install and uninstall libraries on clusters and to check their installation status.


All Cluster Statuses

Endpoint HTTP Method
2.0/libraries/all-cluster-statuses GET

Get the status of all libraries on all clusters. A status is available for every library installed on a cluster via the API or the libraries UI, as well as for libraries set to be installed on all clusters via the libraries UI. If a library has been set to be installed on all clusters, is_library_for_all_clusters is true, even if the library was also installed on a specific cluster.

An example response:

{
  "statuses": [
    {
      "cluster_id": "11203-my-cluster",
      "library_statuses": [
        {
          "library": {
            "jar": "dbfs:/mnt/libraries/library.jar"
          },
          "status": "INSTALLING",
          "messages": [],
          "is_library_for_all_clusters": false
        }
      ]
    },
    {
      "cluster_id": "20131-my-other-cluster",
      "library_statuses": [
        {
          "library": {
            "egg": "dbfs:/mnt/libraries/library.egg"
          },
          "status": "FAILED",
          "messages": ["Could not download library"],
          "is_library_for_all_clusters": false
        }
      ]
    }
  ]
}

Response Structure

Field Name Type Description
statuses An array of ClusterLibraryStatuses A list of cluster statuses.
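
As an illustration of how this response might be consumed, the following Python sketch walks the statuses array and collects every library reported as FAILED. The function name failed_libraries and the sample payload are hypothetical; only the field names come from the response structure above.

```python
from typing import Any, Dict, List, Tuple


def failed_libraries(
    response: Dict[str, Any],
) -> List[Tuple[str, Dict[str, Any], List[str]]]:
    """Collect (cluster_id, library, messages) for each library with status FAILED."""
    failures = []
    for cluster in response.get("statuses", []):
        for entry in cluster.get("library_statuses", []):
            if entry.get("status") == "FAILED":
                failures.append(
                    (cluster["cluster_id"], entry["library"], entry.get("messages", []))
                )
    return failures


# Hypothetical payload shaped like an all-cluster-statuses response.
sample_response = {
    "statuses": [
        {
            "cluster_id": "11203-my-cluster",
            "library_statuses": [
                {
                    "library": {"jar": "dbfs:/mnt/libraries/library.jar"},
                    "status": "INSTALLING",
                    "messages": [],
                    "is_library_for_all_clusters": False,
                }
            ],
        },
        {
            "cluster_id": "20131-my-other-cluster",
            "library_statuses": [
                {
                    "library": {"egg": "dbfs:/mnt/libraries/library.egg"},
                    "status": "FAILED",
                    "messages": ["Could not download library"],
                    "is_library_for_all_clusters": False,
                }
            ],
        },
    ]
}

for cluster_id, library, messages in failed_libraries(sample_response):
    print(cluster_id, library, messages)
```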

Cluster Status

Endpoint HTTP Method
2.0/libraries/cluster-status GET

Get the status of libraries on a cluster. A status is available for every library installed on this cluster via the API or the libraries UI, as well as for libraries set to be installed on all clusters via the libraries UI. If a library has been set to be installed on all clusters, is_library_for_all_clusters is true, even if the library was also installed on this specific cluster.

An example request:

2.0/libraries/cluster-status?cluster_id=11203-my-cluster

And response:

{
  "cluster_id": "11203-my-cluster",
  "library_statuses": [
    {
      "library": {
        "jar": "dbfs:/mnt/libraries/library.jar"
      },
      "status": "INSTALLED",
      "messages": [],
      "is_library_for_all_clusters": false
    },
    {
      "library": {
        "pypi": {
          "package": "beautifulsoup4"
        }
      },
      "status": "INSTALLING",
      "messages": ["Successfully resolved package from PyPI"],
      "is_library_for_all_clusters": false
    },
    {
      "library": {
        "cran": {
          "package": "ada",
          "repo": "http://cran.us.r-project.org"
        }
      },
      "status": "FAILED",
      "messages": ["R package installation is not supported on this spark version.\nPlease upgrade to Runtime 3.2 or higher"],
      "is_library_for_all_clusters": false
    }
  ]
}

Request Structure

Field Name Type Description
cluster_id STRING Unique identifier of the cluster whose status should be retrieved. This field is required.

Response Structure

Field Name Type Description
cluster_id STRING Unique identifier for the cluster.
library_statuses An array of LibraryFullStatus Status of all libraries on the cluster.
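
Because cluster_id is passed as a query parameter, the request URL can be assembled with standard URL encoding. The sketch below is one way to do that in Python; API_ROOT is a placeholder assumption for your deployment's API root, and authentication is out of scope here.

```python
from urllib.parse import urlencode

# Assumption: placeholder API root; replace with the root of your deployment.
API_ROOT = "https://example.invalid/api"


def cluster_status_url(cluster_id: str, api_root: str = API_ROOT) -> str:
    """Build the GET URL for 2.0/libraries/cluster-status."""
    if not cluster_id:
        # cluster_id is a required field of the request.
        raise ValueError("cluster_id is required")
    query = urlencode({"cluster_id": cluster_id})
    return f"{api_root.rstrip('/')}/2.0/libraries/cluster-status?{query}"


print(cluster_status_url("11203-my-cluster"))
```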

Install

Endpoint HTTP Method
2.0/libraries/install POST

Add libraries to be installed on a cluster. The installation is asynchronous: it happens in the background after this request completes. The actual set of libraries installed on a cluster is the union of the libraries specified via this method and the libraries set to be installed on all clusters via the libraries UI.

Note that CRAN libraries can only be installed on clusters running Databricks Runtime 3.2 or higher.

An example request:

{
  "cluster_id": "10201-my-cluster",
  "libraries": [
    {
      "jar": "dbfs:/mnt/libraries/library.jar"
    },
    {
      "egg": "dbfs:/mnt/libraries/library.egg"
    },
    {
      "maven": {
        "coordinates": "org.jsoup:jsoup:1.7.2",
        "exclusions": ["slf4j:slf4j"]
      }
    },
    {
      "pypi": {
        "package": "simplejson",
        "repo": "http://my-pypi-mirror.com"
      }
    },
    {
      "cran": {
        "package": "ada",
        "repo": "http://cran.us.r-project.org"
      }
    }
  ]
}

Request Structure

Field Name Type Description
cluster_id STRING Unique identifier for the cluster on which to install these libraries. This field is required.
libraries An array of Library The libraries to install.
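
A minimal sketch of preparing this POST request with the Python standard library follows. It only constructs the request object and does not send it. API_ROOT and build_install_request are hypothetical names, and any authentication headers your deployment requires would need to be added.

```python
import json
from typing import Any, Dict, List
from urllib.request import Request

# Assumption: placeholder API root; replace with the root of your deployment.
API_ROOT = "https://example.invalid/api"


def build_install_request(
    cluster_id: str, libraries: List[Dict[str, Any]], api_root: str = API_ROOT
) -> Request:
    """Prepare (but do not send) a POST to 2.0/libraries/install."""
    if not cluster_id:
        raise ValueError("cluster_id is required")
    body = json.dumps({"cluster_id": cluster_id, "libraries": libraries})
    return Request(
        url=f"{api_root.rstrip('/')}/2.0/libraries/install",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


install_request = build_install_request(
    "10201-my-cluster",
    [
        {"jar": "dbfs:/mnt/libraries/library.jar"},
        {"pypi": {"package": "simplejson"}},
    ],
)
print(install_request.get_method(), install_request.full_url)
```

Since installation is asynchronous, a successful response to this request only means the libraries were queued; the cluster-status endpoint reports the actual outcome.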

Uninstall

Endpoint HTTP Method
2.0/libraries/uninstall POST

Set libraries to be uninstalled on a cluster. The libraries won’t be uninstalled until the cluster is restarted. Uninstalling libraries that are not installed on the cluster has no effect and is not an error.

An example request:

{
  "cluster_id": "10201-my-cluster",
  "libraries": [
    {
      "jar": "dbfs:/mnt/libraries/library.jar"
    },
    {
      "cran": {
        "package": "ada"
      }
    }
  ]
}

Request Structure

Field Name Type Description
cluster_id STRING Unique identifier for the cluster on which to uninstall these libraries. This field is required.
libraries An array of Library The libraries to uninstall.
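
Because removal only takes effect on restart, it can be useful to check which libraries are still waiting to be removed. The sketch below filters a cluster-status response for the UNINSTALL_ON_RESTART status; the function name and the sample payload are hypothetical.

```python
from typing import Any, Dict, List


def libraries_pending_removal(cluster_status: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the libraries that will be removed at the next cluster restart."""
    return [
        entry["library"]
        for entry in cluster_status.get("library_statuses", [])
        if entry.get("status") == "UNINSTALL_ON_RESTART"
    ]


# Hypothetical cluster-status response after an uninstall request.
status_after_uninstall = {
    "cluster_id": "10201-my-cluster",
    "library_statuses": [
        {
            "library": {"jar": "dbfs:/mnt/libraries/library.jar"},
            "status": "UNINSTALL_ON_RESTART",
            "messages": [],
            "is_library_for_all_clusters": False,
        },
        {
            "library": {"pypi": {"package": "simplejson"}},
            "status": "INSTALLED",
            "messages": [],
            "is_library_for_all_clusters": False,
        },
    ],
}

pending = libraries_pending_removal(status_after_uninstall)
if pending:
    print("Restart needed to remove:", pending)
```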

Data Structures

ClusterLibraryStatuses

Field Name Type Description
cluster_id STRING Unique identifier for the cluster.
library_statuses An array of LibraryFullStatus Status of all libraries on the cluster.

Library

Field Name Type Description
jar OR egg OR pypi OR maven OR cran STRING OR STRING OR PythonPyPiLibrary OR MavenLibrary OR RCranLibrary

If jar, URI of the jar to be installed. Currently only DBFS and S3 URIs are supported. For example: { "jar": "dbfs:/mnt/databricks/library.jar" } or { "jar": "s3://my-bucket/library.jar" }. If S3 is used, please make sure the cluster has read access on the library. You may need to launch the cluster with an IAM role to access the S3 URI.

If egg, URI of the egg to be installed. Currently only DBFS and S3 URIs are supported. For example: { "egg": "dbfs:/my/egg" } or { "egg": "s3://my-bucket/egg" }. If S3 is used, please make sure the cluster has read access on the library. You may need to launch the cluster with an IAM role to access the S3 URI.

If pypi, specification of a PyPI library to be installed. For example: { "package": "simplejson" }

If maven, specification of a Maven library to be installed. For example: { "coordinates": "org.jsoup:jsoup:1.7.2" }

If cran, specification of a CRAN library to be installed. For example: { "package": "ada" }
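
The "OR" in the table suggests that each Library object carries a single field. Under that assumption, a client-side check before sending a request could look like the following sketch; validate_library and LIBRARY_FIELD_TYPES are hypothetical helpers, not part of the API.

```python
from typing import Any, Dict

# Field name -> expected Python type, following the Library table:
# jar and egg are URI strings; pypi, maven and cran are nested objects.
LIBRARY_FIELD_TYPES = {"jar": str, "egg": str, "pypi": dict, "maven": dict, "cran": dict}


def validate_library(library: Dict[str, Any]) -> str:
    """Return the library kind, or raise if the object is malformed.

    Assumption: a Library must contain exactly one of the known fields.
    """
    known = [key for key in library if key in LIBRARY_FIELD_TYPES]
    if len(library) != 1 or len(known) != 1:
        raise ValueError(
            "a Library must set exactly one of: " + ", ".join(LIBRARY_FIELD_TYPES)
        )
    kind = known[0]
    expected = LIBRARY_FIELD_TYPES[kind]
    if not isinstance(library[kind], expected):
        raise TypeError(f"'{kind}' must be of type {expected.__name__}")
    return kind


print(validate_library({"jar": "dbfs:/mnt/databricks/library.jar"}))
print(validate_library({"maven": {"coordinates": "org.jsoup:jsoup:1.7.2"}}))
```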

LibraryFullStatus

The status of the library on a specific cluster.

Field Name Type Description
library Library Unique identifier for the library.
status LibraryInstallStatus Status of installing the library on the cluster.
messages An array of STRING All the info and warning messages that have occurred so far for this library.
is_library_for_all_clusters BOOL Whether the library was set to be installed on all clusters via the libraries UI.

MavenLibrary

Field Name Type Description
coordinates STRING Gradle-style maven coordinates. For example: “org.jsoup:jsoup:1.7.2”. This field is required.
repo STRING Maven repo to install the Maven package from. If omitted, both Maven Central Repository and Spark Packages are searched.
exclusions An array of STRING List of dependencies to exclude. For example: ["slf4j:slf4j", "*:hadoop-client"]. See Maven dependency exclusions: https://maven.apache.org/guides/introduction/introduction-to-optional-and-excludes-dependencies.html.

PythonPyPiLibrary

Field Name Type Description
package STRING The name of the pypi package to install. An optional exact version specification is also supported. Examples: “simplejson” and “simplejson==3.8.0”. This field is required.
repo STRING The repository where the package can be found. If not specified, the default pip index is used.
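
To show how the optional exact version and optional repo fit together, here is a small Python sketch that builds a Library object of the pypi kind. The helper name pypi_library is hypothetical.

```python
from typing import Any, Dict, Optional


def pypi_library(
    package: str, version: Optional[str] = None, repo: Optional[str] = None
) -> Dict[str, Any]:
    """Build a Library object wrapping a PythonPyPiLibrary specification."""
    if not package:
        raise ValueError("package is required")
    # An exact version is expressed as "name==version" in the package field.
    spec: Dict[str, Any] = {"package": f"{package}=={version}" if version else package}
    if repo:
        # Omitting repo leaves the choice to the default pip index.
        spec["repo"] = repo
    return {"pypi": spec}


print(pypi_library("simplejson"))
print(pypi_library("simplejson", version="3.8.0", repo="http://my-pypi-mirror.com"))
```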

RCranLibrary

Field Name Type Description
package STRING The name of the CRAN package to install. This field is required.
repo STRING The repository where the package can be found. If not specified, the default CRAN repo is used.

LibraryInstallStatus

The status of a library on a specific cluster.

PENDING No action has yet been taken to install the library. This state should be very short lived.
RESOLVING Metadata necessary to install the library is being retrieved from the provided repository. For jar and egg libraries, this step is a no-op.

INSTALLING The library is actively being installed, either by adding resources to Spark or executing system commands inside the Spark nodes.
INSTALLED The library has been successfully installed and can now be used.
FAILED Some step in installation failed. More information can be found in the messages field.
UNINSTALL_ON_RESTART The library has been marked for removal. Currently, libraries can only be removed when clusters are restarted, so libraries that enter this state will remain until the cluster is restarted.
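
When polling for installation progress, these values can be modelled as an enumeration. The sketch below treats INSTALLED, FAILED and UNINSTALL_ON_RESTART as settled, meaning no further change is expected without a restart or a new request; that grouping is an assumption made for illustration and is not stated by the API.

```python
from enum import Enum
from typing import Any, Dict, List


class LibraryInstallStatus(str, Enum):
    PENDING = "PENDING"
    RESOLVING = "RESOLVING"
    INSTALLING = "INSTALLING"
    INSTALLED = "INSTALLED"
    FAILED = "FAILED"
    UNINSTALL_ON_RESTART = "UNINSTALL_ON_RESTART"


# Assumption: statuses after which polling can stop.
SETTLED = {
    LibraryInstallStatus.INSTALLED,
    LibraryInstallStatus.FAILED,
    LibraryInstallStatus.UNINSTALL_ON_RESTART,
}


def all_settled(library_statuses: List[Dict[str, Any]]) -> bool:
    """True when no library is still PENDING, RESOLVING or INSTALLING.

    An unrecognised status string raises ValueError.
    """
    return all(
        LibraryInstallStatus(entry["status"]) in SETTLED for entry in library_statuses
    )


print(all_settled([{"status": "INSTALLED"}, {"status": "INSTALLING"}]))
```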