What is the Databricks CLI?

Note

This information applies to Databricks CLI versions 0.205 and above, which are in Public Preview. To find your version of the Databricks CLI, run databricks -v.
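
For example, running the version check might produce output like the following (the exact version string varies by installation):

databricks -v
# Databricks CLI v0.205.2 (illustrative output; your version will differ)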

The Databricks command-line interface (also known as the Databricks CLI) provides a tool to automate the Databricks platform from your terminal, command prompt, or automation scripts.

Information for legacy Databricks CLI users

  • Databricks provides no support and plans no new feature work for the legacy Databricks CLI.

  • For more information about the legacy Databricks CLI, see Databricks CLI (legacy).

  • To migrate from Databricks CLI version 0.18 or below to Databricks CLI version 0.205 or above, see Databricks CLI migration.

How does the Databricks CLI work?

The CLI wraps the Databricks REST API, an application programming interface (API) that uses REST endpoints to automate Databricks account and workspace resources and data. See the Databricks REST API reference.

For example, to print information about an individual cluster in a workspace, you run the CLI as follows:

databricks clusters get 1234-567890-a12bcde3

With curl, the equivalent operation is lengthier to express and more prone to typing errors, as follows:

curl --request GET "https://${DATABRICKS_HOST}/api/2.0/clusters/get" \
     --header "Authorization: Bearer ${DATABRICKS_TOKEN}" \
     --data '{ "cluster_id": "1234-567890-a12bcde3" }'
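
The preceding curl command assumes that the DATABRICKS_HOST and DATABRICKS_TOKEN environment variables are already set. A minimal sketch of that setup, using a placeholder workspace hostname and personal access token, might look like this:

export DATABRICKS_HOST="dbc-a1b2345c-d6e7.cloud.databricks.com"  # workspace hostname only; the curl command adds the https:// prefix
export DATABRICKS_TOKEN="<your-personal-access-token>"           # a Databricks personal access token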

Example: create a Databricks job

The following example uses the CLI to create a Databricks job with a single task that runs the specified Databricks notebook. The notebook depends on a specific version of the PyPI package named wheel. To run the task, the job temporarily creates a job cluster that exports an environment variable named PYSPARK_PYTHON, and the cluster is terminated after the job runs.

databricks jobs create --json '{
  "name": "My hello notebook job",
  "tasks": [
    {
      "task_key": "my_hello_notebook_task",
      "notebook_task": {
        "notebook_path": "/Workspace/Users/someone@example.com/hello",
        "source": "WORKSPACE"
      },
      "libraries": [
        {
          "pypi": {
            "package": "wheel==0.41.2"
          }
        }
      ],
      "new_cluster": {
        "spark_version": "13.3.x-scala2.12",
        "node_type_id": "i3.xlarge",
        "num_workers": 1,
        "spark_env_vars": {
          "PYSPARK_PYTHON": "/databricks/python3/bin/python3"
        }
      }
    }
  ]
}'
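
The create command prints the ID of the new job. Assuming a hypothetical job ID of 1234, you could then trigger a run and inspect the job definition with commands like the following:

databricks jobs run-now 1234
databricks jobs get 1234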

Next steps