Libraries API 2.0
The Libraries API allows you to install and uninstall libraries and get the status of libraries on a cluster.
Important
To access Databricks REST APIs, you must authenticate.
All cluster statuses
Endpoint | HTTP Method |
---|---|
2.0/libraries/all-cluster-statuses | GET |
Get the status of all libraries on all clusters. A status is available for all libraries installed on clusters via the API or the libraries UI, as well as libraries set to be installed on all clusters via the libraries UI. If a library has been set to be installed on all clusters, is_library_for_all_clusters will be true, even if the library was also installed on this specific cluster.
Example
Request
curl --netrc --request GET \
https://<databricks-instance>/api/2.0/libraries/all-cluster-statuses \
| jq .
Replace <databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.
Response
{
  "statuses": [
    {
      "cluster_id": "11203-my-cluster",
      "library_statuses": [
        {
          "library": {
            "jar": "dbfs:/mnt/libraries/library.jar"
          },
          "status": "INSTALLING",
          "messages": [],
          "is_library_for_all_clusters": false
        }
      ]
    },
    {
      "cluster_id": "20131-my-other-cluster",
      "library_statuses": [
        {
          "library": {
            "egg": "dbfs:/mnt/libraries/library.egg"
          },
          "status": "FAILED",
          "messages": ["Could not download library"],
          "is_library_for_all_clusters": false
        }
      ]
    }
  ]
}
Response structure
Field Name | Type | Description |
---|---|---|
statuses | An array of ClusterLibraryStatuses | A list of cluster statuses. |
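As an illustration of how this response might be consumed, the following Python sketch groups the libraries that report a FAILED status by cluster. The function name failed_libraries is hypothetical and not part of the API; the input is assumed to be the already-decoded JSON response.

```python
from typing import Dict, List


def failed_libraries(response: dict) -> Dict[str, List[dict]]:
    """Map each cluster_id to its library statuses whose status is FAILED."""
    failures: Dict[str, List[dict]] = {}
    for cluster in response.get("statuses", []):
        failed = [
            entry
            for entry in cluster.get("library_statuses", [])
            if entry.get("status") == "FAILED"
        ]
        # Only clusters with at least one failure appear in the result.
        if failed:
            failures[cluster["cluster_id"]] = failed
    return failures
```

Each value in the returned mapping keeps the full status entry, so the messages field stays available for diagnosing the failure.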
Cluster status
Endpoint | HTTP Method |
---|---|
2.0/libraries/cluster-status | GET |
Get the status of libraries on a cluster. A status is available for all libraries installed on the cluster via the API or the libraries UI, as well as libraries set to be installed on all clusters via the libraries UI. If a library has been set to be installed on all clusters, is_library_for_all_clusters will be true, even if the library was also installed on the cluster.
Example
Request
curl --netrc --request GET \
'https://<databricks-instance>/api/2.0/libraries/cluster-status?cluster_id=<cluster-id>' \
| jq .
Or:
curl --netrc --get \
https://<databricks-instance>/api/2.0/libraries/cluster-status \
--data cluster_id=<cluster-id> \
| jq .
Replace:
<databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.
<cluster-id> with the Databricks workspace ID of the cluster, for example 1234-567890-example123.
Response
{
  "cluster_id": "11203-my-cluster",
  "library_statuses": [
    {
      "library": {
        "jar": "dbfs:/mnt/libraries/library.jar"
      },
      "status": "INSTALLED",
      "messages": [],
      "is_library_for_all_clusters": false
    },
    {
      "library": {
        "pypi": {
          "package": "beautifulsoup4"
        }
      },
      "status": "INSTALLING",
      "messages": ["Successfully resolved package from PyPI"],
      "is_library_for_all_clusters": false
    },
    {
      "library": {
        "cran": {
          "package": "ada",
          "repo": "https://cran.us.r-project.org"
        }
      },
      "status": "FAILED",
      "messages": ["R package installation is not supported on this spark version.\nPlease upgrade to Runtime 3.2 or higher"],
      "is_library_for_all_clusters": false
    }
  ]
}
Request structure
Field Name | Type | Description |
---|---|---|
cluster_id | STRING | Unique identifier of the cluster whose status should be retrieved. This field is required. |
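Because cluster_id travels in the query string, it can help to let a library handle the encoding. The sketch below builds the request URL with the Python standard library; cluster_status_url is a hypothetical helper name, and sending the request with authentication is left out.

```python
from urllib.parse import urlencode


def cluster_status_url(instance: str, cluster_id: str) -> str:
    """Build the cluster-status URL with cluster_id as an encoded query parameter."""
    query = urlencode({"cluster_id": cluster_id})
    return f"https://{instance}/api/2.0/libraries/cluster-status?{query}"


# Example values taken from the placeholders described above.
url = cluster_status_url(
    "dbc-a1b2345c-d6e7.cloud.databricks.com", "1234-567890-example123"
)
```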
Response structure
Field Name | Type | Description |
---|---|---|
cluster_id | STRING | Unique identifier for the cluster. |
library_statuses | An array of LibraryFullStatus | Status of all libraries on the cluster. |
Install
Endpoint | HTTP Method |
---|---|
2.0/libraries/install | POST |
Install libraries on a cluster. The installation is asynchronous: it completes in the background after the request returns.
Important
This call will fail if the cluster is terminated.
Installing a wheel library on a cluster is like running the pip command against the wheel file directly on the driver and executors. All the dependencies specified in the library setup.py file are installed, which requires the library name to satisfy the wheel file name convention.
The installation on the executors happens only when a new task is launched.
With Databricks Runtime 7.1 and below, the installation order of libraries is nondeterministic. For wheel libraries, you can ensure a deterministic installation order by creating a zip file with the suffix .wheelhouse.zip that includes all the wheel files.
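One way to produce such an archive is with Python's zipfile module. This is a minimal sketch: build_wheelhouse is a hypothetical helper, and placing the wheels at the root of the archive is an assumption, since the text above only specifies the .wheelhouse.zip suffix.

```python
import zipfile
from pathlib import Path
from typing import Iterable


def build_wheelhouse(wheel_paths: Iterable[Path], output_dir: Path, name: str) -> Path:
    """Bundle wheel files into <name>.wheelhouse.zip and return the archive path."""
    archive_path = output_dir / f"{name}.wheelhouse.zip"
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for wheel in wheel_paths:
            # Assumption: each wheel is stored at the archive root under its file name.
            archive.write(wheel, arcname=wheel.name)
    return archive_path
```

The resulting file would then be uploaded to a supported location and referenced through the whl field of the install request.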
Example
curl --netrc --request POST \
https://<databricks-instance>/api/2.0/libraries/install \
--data @install-libraries.json
install-libraries.json:
{
  "cluster_id": "10201-my-cluster",
  "libraries": [
    {
      "jar": "dbfs:/mnt/libraries/library.jar"
    },
    {
      "egg": "dbfs:/mnt/libraries/library.egg"
    },
    {
      "whl": "dbfs:/mnt/libraries/mlflow-0.0.1.dev0-py2-none-any.whl"
    },
    {
      "whl": "dbfs:/mnt/libraries/wheel-libraries.wheelhouse.zip"
    },
    {
      "maven": {
        "coordinates": "org.jsoup:jsoup:1.7.2",
        "exclusions": ["slf4j:slf4j"]
      }
    },
    {
      "pypi": {
        "package": "simplejson",
        "repo": "https://my-pypi-mirror.com"
      }
    },
    {
      "cran": {
        "package": "ada",
        "repo": "https://cran.us.r-project.org"
      }
    }
  ]
}
Replace:
<databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.
The contents of install-libraries.json with fields that are appropriate for your solution.
This example uses a .netrc file.
Request structure
Field Name | Type | Description |
---|---|---|
cluster_id | STRING | Unique identifier for the cluster on which to install these libraries. This field is required. |
libraries | An array of Library | The libraries to install. |
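The request body can also be generated programmatically rather than written by hand. The following sketch serializes the two fields from the table above; install_request_body is a hypothetical helper, and the library entries reuse values from the example request.

```python
import json
from typing import List


def install_request_body(cluster_id: str, libraries: List[dict]) -> str:
    """Serialize an install request; cluster_id is required by the API."""
    if not cluster_id:
        raise ValueError("cluster_id is required")
    return json.dumps({"cluster_id": cluster_id, "libraries": libraries}, indent=2)


body = install_request_body(
    "10201-my-cluster",
    [
        {"jar": "dbfs:/mnt/libraries/library.jar"},
        {"pypi": {"package": "simplejson"}},
    ],
)
```

The resulting string could be written to a file such as install-libraries.json and passed to curl with --data as shown in the example.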
Uninstall
Endpoint | HTTP Method |
---|---|
2.0/libraries/uninstall | POST |
Set libraries to be uninstalled on a cluster. The libraries aren’t uninstalled until the cluster is restarted. Uninstalling libraries that are not installed on the cluster has no impact but is not an error.
Example
curl --netrc --request POST \
https://<databricks-instance>/api/2.0/libraries/uninstall \
--data @uninstall-libraries.json
uninstall-libraries.json:
{
  "cluster_id": "10201-my-cluster",
  "libraries": [
    {
      "jar": "dbfs:/mnt/libraries/library.jar"
    },
    {
      "cran": {
        "package": "ada"
      }
    }
  ]
}
Replace:
<databricks-instance> with the Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.
The contents of uninstall-libraries.json with fields that are appropriate for your solution.
This example uses a .netrc file.
Request structure
Field Name | Type | Description |
---|---|---|
cluster_id | STRING | Unique identifier for the cluster on which to uninstall these libraries. This field is required. |
libraries | An array of Library | The libraries to uninstall. |
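Since libraries marked for uninstallation stay on the cluster until it restarts, it can be useful to check which ones are still waiting. This sketch reads a decoded cluster-status response and returns the libraries in the UNINSTALL_ON_RESTART state; libraries_pending_removal is a hypothetical name.

```python
from typing import List


def libraries_pending_removal(cluster_status: dict) -> List[dict]:
    """Return library specs that are marked for removal on the next restart."""
    return [
        entry["library"]
        for entry in cluster_status.get("library_statuses", [])
        if entry.get("status") == "UNINSTALL_ON_RESTART"
    ]
```

A non-empty result suggests that the cluster still needs a restart before the uninstall takes effect.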
Data structures
ClusterLibraryStatuses
Field Name | Type | Description |
---|---|---|
cluster_id | STRING | Unique identifier for the cluster. |
library_statuses | An array of LibraryFullStatus | Status of all libraries on the cluster. |
Library
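Field Name | Type | Description |
---|---|---|
jar OR egg OR whl OR pypi OR maven OR cran | STRING OR STRING OR STRING OR PythonPyPiLibrary OR MavenLibrary OR RCranLibrary | If jar, URI of the JAR to be installed. DBFS and S3 URIs are supported. For example: { "jar": "dbfs:/mnt/libraries/library.jar" }. If egg, URI of the egg to be installed. DBFS and S3 URIs are supported. For example: { "egg": "dbfs:/mnt/libraries/library.egg" }. If whl, URI of the wheel or zipped wheels to be installed. DBFS and S3 URIs are supported. For example: { "whl": "dbfs:/mnt/libraries/mlflow-0.0.1.dev0-py2-none-any.whl" }. If pypi, specification of a PyPI library to be installed. Specifying the repo field is optional; if it is not specified, the default pip index is used. For example: { "package": "simplejson", "repo": "https://my-pypi-mirror.com" }. If maven, specification of a Maven library to be installed. For example: { "coordinates": "org.jsoup:jsoup:1.7.2" }. If cran, specification of a CRAN library to be installed. |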
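Because a Library carries exactly one of these fields, a client may want to check that before sending a request. The sketch below is a hypothetical client-side check, not something the API provides; the key names come from the table above.

```python
LIBRARY_KINDS = ("jar", "egg", "whl", "pypi", "maven", "cran")


def library_kind(library: dict) -> str:
    """Return the single library kind present in a Library object."""
    present = [kind for kind in LIBRARY_KINDS if kind in library]
    if len(present) != 1:
        raise ValueError(
            f"expected exactly one of {LIBRARY_KINDS}, found {present or 'none'}"
        )
    return present[0]
```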
LibraryFullStatus
The status of the library on a specific cluster.
Field Name | Type | Description |
---|---|---|
library | Library | Unique identifier for the library. |
status | LibraryInstallStatus | Status of installing the library on the cluster. |
messages | An array of STRING | All the info and warning messages that have occurred so far for this library. |
is_library_for_all_clusters | BOOL | Whether the library was set to be installed on all clusters via the libraries UI. |
MavenLibrary
Field Name | Type | Description |
---|---|---|
coordinates | STRING | Gradle-style Maven coordinates. For example: org.jsoup:jsoup:1.7.2. |
repo | STRING | Maven repo to install the Maven package from. If omitted, both Maven Central Repository and Spark Packages are searched. |
exclusions | An array of STRING | List of dependencies to exclude. For example: ["slf4j:slf4j"]. Maven dependency exclusions: https://maven.apache.org/guides/introduction/introduction-to-optional-and-excludes-dependencies.html. |
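A small helper can assemble a MavenLibrary entry and leave out the optional fields when they are not set. This is a sketch under one assumption: it only accepts the three-part group:artifact:version form shown in the example, which may be stricter than what the service accepts. The name maven_library is hypothetical.

```python
from typing import List, Optional


def maven_library(
    coordinates: str,
    repo: Optional[str] = None,
    exclusions: Optional[List[str]] = None,
) -> dict:
    """Build a Library object of kind maven from its fields."""
    parts = coordinates.split(":")
    # Assumption: coordinates use the group:artifact:version form.
    if len(parts) != 3 or not all(parts):
        raise ValueError("coordinates must look like group:artifact:version")
    spec: dict = {"coordinates": coordinates}
    if repo:
        spec["repo"] = repo
    if exclusions:
        spec["exclusions"] = list(exclusions)
    return {"maven": spec}
```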
PythonPyPiLibrary
Field Name | Type | Description |
---|---|---|
package | STRING | The name of the PyPI package to install. An optional exact version specification is also supported. Examples: simplejson and simplejson==3.8.0. |
repo | STRING | The repository where the package can be found. If not specified, the default pip index is used. |
RCranLibrary
Field Name | Type | Description |
---|---|---|
package | STRING | The name of the CRAN package to install. This field is required. |
repo | STRING | The repository where the package can be found. If not specified, the default CRAN repo is used. |
LibraryInstallStatus
The status of a library on a specific cluster.
Status | Description |
---|---|
PENDING | No action has yet been taken to install the library. This state should be very short lived. |
RESOLVING | Metadata necessary to install the library is being retrieved from the provided repository. For Jar, Egg, and Whl libraries, this step is a no-op. |
INSTALLING | The library is actively being installed, either by adding resources to Spark or executing system commands inside the Spark nodes. |
INSTALLED | The library has been successfully installed. |
SKIPPED | Installation on a Databricks Runtime 7.0 or above cluster was skipped due to Scala version incompatibility. |
FAILED | Some step in installation failed. More information can be found in the messages field. |
UNINSTALL_ON_RESTART | The library has been marked for removal. Libraries can be removed only when clusters are restarted, so libraries that enter this state will remain until the cluster is restarted. |
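Because installation is asynchronous, a client that polls the cluster status needs to know when to stop. The sketch below models the states in the table as a Python enum and treats PENDING, RESOLVING, and INSTALLING as still in progress. That grouping is an interpretation of the descriptions above, and installation_settled is a hypothetical helper.

```python
from enum import Enum


class LibraryInstallStatus(str, Enum):
    PENDING = "PENDING"
    RESOLVING = "RESOLVING"
    INSTALLING = "INSTALLING"
    INSTALLED = "INSTALLED"
    SKIPPED = "SKIPPED"
    FAILED = "FAILED"
    UNINSTALL_ON_RESTART = "UNINSTALL_ON_RESTART"


# Assumption: these states mean the installation has not finished yet.
IN_PROGRESS = frozenset(
    {
        LibraryInstallStatus.PENDING,
        LibraryInstallStatus.RESOLVING,
        LibraryInstallStatus.INSTALLING,
    }
)


def installation_settled(cluster_status: dict) -> bool:
    """Return True when no library on the cluster is still being installed."""
    statuses = (
        LibraryInstallStatus(entry["status"])
        for entry in cluster_status.get("library_statuses", [])
    )
    return all(status not in IN_PROGRESS for status in statuses)
```

A status string outside the table raises ValueError when converted to the enum, which surfaces unexpected values instead of silently ignoring them.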