REST API 1.2

The Databricks REST API allows you to programmatically access Databricks instead of going through the web UI.

This article covers REST API 1.2. The latest version of the REST API, as well as REST API versions 2.1 and 2.0, are also available.

Important

To access Databricks REST APIs, you must authenticate.

REST API use cases

  • Start Apache Spark jobs triggered from your existing production systems or from Airflow.

  • Programmatically bring up a cluster of a certain size at a fixed time of day and then shut it down at night.

API categories

  • Execution context: create unique variable namespaces in which Spark commands can be run.

  • Command execution: run commands within a specific execution context.

Details

  • This REST API runs over HTTPS.

  • For retrieving information, use HTTP GET.

  • For modifying state, use HTTP POST.

  • For file upload, use multipart/form-data. Otherwise use application/json.

  • The response content type is JSON.

  • Basic authentication is used to authenticate the user for every API call.

  • User credentials are base64-encoded and passed in the HTTP Authorization header with every API call. For example, Authorization: Basic YWRtaW46YWRtaW4=. If you use curl, you can alternatively store user credentials in a .netrc file; see the example following this list.

  • For more information about using the Databricks REST API, see the Databricks REST API reference.
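
You can generate the base64-encoded value shown above yourself. The admin:admin credentials in this sketch are illustrative only; substitute your own:

printf 'admin:admin' | base64
# YWRtaW46YWRtaW4=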

Get started

  • To try out the examples in this article, replace <databricks-instance> with the workspace URL of your Databricks deployment.

  • The following examples use curl and a .netrc file. You can adapt these curl examples with an HTTP library in your programming language of choice.
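
  • The .netrc entry that these examples rely on follows this pattern. The login value token and the personal access token placeholder reflect the common token-based setup; substitute whatever credentials your workspace accepts:

    machine <databricks-instance>
    login token
    password <personal-access-token>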

API reference

Get the list of clusters

Method and path:

GET /api/1.2/clusters/list

Example

Request:

curl --netrc --request GET \
  https://<databricks-instance>/api/1.2/clusters/list

Response:

[
  {
    "id": "1234-567890-span123",
    "name": "MyCluster",
    "status": "Terminated",
    "driverIp": "",
    "jdbcPort": 10000,
    "numWorkers":0
  },
  {
    "..."
  }
]

Request schema

None.

Response schema

An array of objects, with each object representing information about a cluster as follows:

  • id (string): The ID of the cluster.

  • name (string): The name of the cluster.

  • status (string): The status of the cluster. One of: Error, Pending, Reconfiguring, Restarting, Running, Terminated, Terminating, Unknown.

  • driverIp (string): The IP address of the driver.

  • jdbcPort (number): The JDBC port number.

  • numWorkers (number): The number of workers for the cluster.

Get information about a cluster

Method and path:

GET /api/1.2/clusters/status

Example

Request:

curl --netrc --get \
  https://<databricks-instance>/api/1.2/clusters/status \
  --data clusterId=1234-567890-span123

Response:

{
  "id": "1234-567890-span123",
  "name": "MyCluster",
  "status": "Terminated",
  "driverIp": "",
  "jdbcPort": 10000,
  "numWorkers": 0
}

Request schema

  • clusterId (string): The ID of the cluster.

Response schema

An object that represents information about the cluster.

  • id (string): The ID of the cluster.

  • name (string): The name of the cluster.

  • status (string): The status of the cluster. One of: Error, Pending, Reconfiguring, Restarting, Running, Terminated, Terminating, Unknown.

  • driverIp (string): The IP address of the driver.

  • jdbcPort (number): The JDBC port number.

  • numWorkers (number): The number of workers for the cluster.

Restart a cluster

Method and path:

POST /api/1.2/clusters/restart

Example

Request:

curl --netrc --request POST \
  https://<databricks-instance>/api/1.2/clusters/restart \
  --data clusterId=1234-567890-span123

Response:

{
  "id": "1234-567890-span123"
}

Request schema

  • clusterId (string): The ID of the cluster to restart.

Response schema

  • id (string): The ID of the cluster.

Create an execution context

Method and path:

POST /api/1.2/contexts/create

Example

Request:

curl --netrc --request POST \
  https://<databricks-instance>/api/1.2/contexts/create \
  --data clusterId=1234-567890-span123 \
  --data language=sql

Response:

{
  "id": "1234567890123456789"
}

Request schema

  • clusterId (string): The ID of the cluster to create the context for.

  • language (string): The language for the context. One of: python, scala, sql.

Response schema

  • id (string): The ID of the execution context.

Get information about an execution context

Method and path:

GET /api/1.2/contexts/status

Example

Request:

curl --netrc 'https://<databricks-instance>/api/1.2/contexts/status?clusterId=1234-567890-span123&contextId=1234567890123456789'

Response:

{
  "id": "1234567890123456789",
  "status": "Running"
}

Request schema

  • clusterId (string): The ID of the cluster to get execution context information about.

  • contextId (string): The ID of the execution context.

Response schema

  • id (string): The ID of the execution context.

  • status (string): The status of the execution context. One of: Error, Pending, Running.

Delete an execution context

Method and path:

POST /api/1.2/contexts/destroy

Example

Request:

curl --netrc --request POST \
  https://<databricks-instance>/api/1.2/contexts/destroy \
  --data clusterId=1234-567890-span123 \
  --data contextId=1234567890123456789

Response:

{
  "id": "1234567890123456789"
}

Request schema

  • clusterId (string): The ID of the cluster to destroy the execution context for.

  • contextId (string): The ID of the execution context to destroy.

Response schema

  • id (string): The ID of the execution context.

Run a command

Method and path:

POST /api/1.2/commands/execute

Example

Request:

curl --netrc --request POST \
  https://<databricks-instance>/api/1.2/commands/execute \
  --header 'Content-Type: application/json' \
  --data @execute-command.json

execute-command.json:

{
   "clusterId": "1234-567890-span123",
   "contextId": "1234567890123456789",
   "language": "python",
   "command": "print('Hello, World!')"
}

Response:

{
  "id": "1234ab56-7890-1cde-234f-5abcdef67890"
}

Request schema

  • clusterId (string): The ID of the cluster to run the command on.

  • contextId (string): The ID of the execution context to run the command within.

  • language (string): The language of the command.

  • command (string): The command string to run. Specify either command or commandFile.

  • commandFile (string): The path to a file containing the command to run. Specify either commandFile or command.

  • options (string): An optional map of values used downstream. For example, a displayRowLimit override (used in testing).

Response schema

  • id (string): The ID of the command.

Get information about a command

Method and path:

GET /api/1.2/commands/status

Example

Request:

curl --netrc --get \
  https://<databricks-instance>/api/1.2/commands/status \
  --data clusterId=1234-567890-span123 \
  --data contextId=1234567890123456789 \
  --data commandId=1234ab56-7890-1cde-234f-5abcdef67890

Response:

{
  "id": "1234ab56-7890-1cde-234f-5abcdef67890",
  "status": "Finished",
  "results": {
    "resultType": "text",
    "data": "Hello, World!"
  }
}

Request schema

  • clusterId (string): The ID of the cluster to get the command information about.

  • contextId (string): The ID of the execution context that is associated with the command.

  • commandId (string): The ID of the command to get information about.

Response schema

  • id (string): The ID of the command.

  • status (string): The status of the command. One of: Cancelled, Cancelling, Error, Finished, Queued, Running.

  • results (object): The results of the command. The fields present depend on resultType:

    • resultType (string): The type of result. One of: error, image, images, table, text.

    For error:

    • cause (string): The cause of the error.

    For image:

    • fileName (string): The image filename.

    For images:

    • fileNames (array of string): The filenames of the images.

    For table:

    • data (array of array of any): The table data.

    • schema (array of array of (string, any)): The table schema.

    • truncated (boolean): true if partial results are returned.

    • isJsonSchema (boolean): true if a JSON schema is returned instead of a string representation of the Hive type.

    For text:

    • data (string): The text.
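
For example, a Finished command with a table result might look like the following sketch. The values are illustrative, and the exact representation of schema entries is an assumption:

{
  "id": "1234ab56-7890-1cde-234f-5abcdef67890",
  "status": "Finished",
  "results": {
    "resultType": "table",
    "data": [["apple", 1], ["banana", 2]],
    "schema": [{"name": "fruit", "type": "\"string\""}, {"name": "count", "type": "\"int\""}],
    "truncated": false,
    "isJsonSchema": true
  }
}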

Cancel a command

Method and path:

POST /api/1.2/commands/cancel

Example

Request:

curl --netrc --request POST \
  https://<databricks-instance>/api/1.2/commands/cancel \
  --data clusterId=1234-567890-span123 \
  --data contextId=1234567890123456789 \
  --data commandId=1234ab56-7890-1cde-234f-5abcdef67890

Response:

{
  "id": "1234ab56-7890-1cde-234f-5abcdef67890"
}

Request schema

  • clusterId (string): The ID of the cluster that is associated with the command to cancel.

  • contextId (string): The ID of the execution context that is associated with the command to cancel.

  • commandId (string): The ID of the command to cancel.

Response schema

  • id (string): The ID of the command.

Get the list of libraries for a cluster

Important

This operation is deprecated. Use the Cluster status operation in the Libraries API instead.

Method and path:

GET /api/1.2/libraries/list

Example

Request:

curl --netrc --get \
  https://<databricks-instance>/api/1.2/libraries/list \
  --data clusterId=1234-567890-span123
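
A response might look like the following; the library name and status are illustrative:

[
  {
    "name": "mylibrary",
    "status": "LibraryLoaded"
  }
]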

Request schema

  • clusterId (string): The ID of the cluster.

Response schema

An array of objects, with each object representing information about a library as follows:

  • name (string): The name of the library.

  • status (string): The status of the library. One of: LibraryError, LibraryLoaded, LibraryPending.

Upload a library to a cluster

Important

This operation is deprecated. Use the Install operation in the Libraries API instead.

Method and path:

POST /api/1.2/libraries/upload
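
Based on the request schema below, a request might look like the following sketch; the library name and URI are hypothetical:

curl --netrc --request POST \
  https://<databricks-instance>/api/1.2/libraries/upload \
  --form clusterId=1234-567890-span123 \
  --form name=mylibrary \
  --form language=scala \
  --form uri=http://example.com/libs/mylibrary.jar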

Request schema

  • clusterId (string): The ID of the cluster to upload the library to.

  • name (string): The name of the library.

  • language (string): The language of the library.

  • uri (string): The URI of the library. The scheme can be file, http, or https.

Response schema

Information about the uploaded library.

  • language (string): The language of the library.

  • uri (string): The URI of the library.

Additional examples

The following additional examples provide commands that you can use with curl or adapt with an HTTP library in your programming language of choice.

Create an execution context

Create an execution context on a specified cluster for a given programming language:

curl --netrc --request POST \
  https://<databricks-instance>/api/1.2/contexts/create \
  --header 'Content-Type: application/json' \
  --data '{ "language": "scala", "clusterId": "1234-567890-span123" }'

Get information about the execution context:

curl --netrc --get \
  https://<databricks-instance>/api/1.2/contexts/status \
  --data 'clusterId=1234-567890-span123&contextId=1234567890123456789'

Delete the execution context:

curl --netrc --request POST \
  https://<databricks-instance>/api/1.2/contexts/destroy \
  --header 'Content-Type: application/json' \
  --data '{ "contextId": "1234567890123456789", "clusterId": "1234-567890-span123" }'

Run a command

Known limitations: command execution does not support %run.

Run a command string:

curl --netrc --request POST \
  https://<databricks-instance>/api/1.2/commands/execute \
  --header 'Content-Type: application/json' \
  --data '{ "language": "scala", "clusterId": "1234-567890-span123", "contextId": "1234567890123456789", "command": "sc.parallelize(1 to 10).collect" }'

Run a file. With --form, curl sets the multipart/form-data content type (including its boundary) automatically, so don't override the Content-Type header manually:

curl --netrc --request POST \
  https://<databricks-instance>/api/1.2/commands/execute \
  --form language=python \
  --form clusterId=1234-567890-span123 \
  --form contextId=1234567890123456789 \
  --form command=@myfile.py
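
Here, myfile.py is any file containing the code to run as the command; a minimal hypothetical example:

print('Hello from a file!')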

Show the command’s status and result:

curl --netrc --get \
  https://<databricks-instance>/api/1.2/commands/status \
  --data 'clusterId=1234-567890-span123&contextId=1234567890123456789&commandId=1234ab56-7890-1cde-234f-5abcdef67890'

Cancel the command:

curl --netrc --request POST \
  https://<databricks-instance>/api/1.2/commands/cancel \
  --data 'clusterId=1234-567890-span123&contextId=1234567890123456789&commandId=1234ab56-7890-1cde-234f-5abcdef67890'

Upload and run a Spark JAR

Upload a JAR

Use the REST API (latest) to upload a JAR and attach it to a cluster.

Run a JAR

  1. Create an execution context.

    curl --netrc --request POST \
      https://<databricks-instance>/api/1.2/contexts/create \
      --data "language=scala&clusterId=1234-567890-span123"
    
    {
      "id": "1234567890123456789"
    }
    
  2. Execute a command that uses your JAR.

    curl --netrc --request POST \
      https://<databricks-instance>/api/1.2/commands/execute \
      --data 'language=scala&clusterId=1234-567890-span123&contextId=1234567890123456789&command=println(com.databricks.apps.logs.chapter1.LogAnalyzer.processLogFile(sc,null,"dbfs:/somefile.log"))'
    
    {
      "id": "1234ab56-7890-1cde-234f-5abcdef67890"
    }
    
  3. Check the status of your command. The status might not reach Finished immediately if you are running a lengthy Spark job; a polling sketch follows the resultType list below.

    curl --netrc 'https://<databricks-instance>/api/1.2/commands/status?clusterId=1234-567890-span123&contextId=1234567890123456789&commandId=1234ab56-7890-1cde-234f-5abcdef67890'
    
    {
       "id": "1234ab56-7890-1cde-234f-5abcdef67890",
       "results": {
         "data": "Content Size Avg: 1234, Min: 1234, Max: 1234",
         "resultType": "text"
       },
       "status": "Finished"
    }
    

    Allowed values for resultType include:

    • error

    • image

    • images

    • table

    • text
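
    Rather than checking the status once, you can poll until the command leaves the Queued and Running states. The following loop is a minimal sketch; it assumes the jq utility is available to parse the JSON response:

    while true; do
      STATUS=$(curl --silent --netrc --get \
        https://<databricks-instance>/api/1.2/commands/status \
        --data clusterId=1234-567890-span123 \
        --data contextId=1234567890123456789 \
        --data commandId=1234ab56-7890-1cde-234f-5abcdef67890 | jq -r .status)
      echo "Command status: $STATUS"
      if [ "$STATUS" != "Queued" ] && [ "$STATUS" != "Running" ]; then
        break
      fi
      sleep 5
    done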