What is the Databricks CLI?


This information applies to Databricks CLI versions 0.205 and above, which are in Public Preview. To find your version of the Databricks CLI, run databricks -v.

The Databricks command-line interface (also known as the Databricks CLI) provides a tool to automate the Databricks platform from your terminal, command prompt, or automation scripts.

Information for legacy Databricks CLI users

  • Databricks plans no support or new feature work for the legacy Databricks CLI.

  • For more information about the legacy Databricks CLI, see Databricks CLI (legacy).

  • To migrate from Databricks CLI version 0.17 or below to Databricks CLI version 0.205 or above, see Databricks CLI migration.

How does the Databricks CLI work?

The CLI wraps the Databricks REST API, an application programming interface (API) that uses REST-style endpoints to automate Databricks account and workspace resources and data. See the Databricks REST API reference.

For example, to print information about an individual cluster in a workspace, you run the CLI as follows:

databricks clusters get 1234-567890-a12bcde3

With curl, the equivalent operation is more verbose and more prone to typing errors, as follows:

curl --request GET "https://${DATABRICKS_HOST}/api/2.0/clusters/get" \
     --header "Authorization: Bearer ${DATABRICKS_TOKEN}" \
     --data '{ "cluster_id": "1234-567890-a12bcde3" }'

Example: create a Databricks job

The following example uses the CLI to create a Databricks job with a single task that runs the specified Databricks notebook. The notebook depends on a specific version of the PyPI package named wheel. To run the task, the job temporarily creates a job cluster that exports an environment variable named PYSPARK_PYTHON. After the job runs, the cluster is terminated.

databricks jobs create --json '{
  "name": "My hello notebook job",
  "tasks": [
    {
      "task_key": "my_hello_notebook_task",
      "notebook_task": {
        "notebook_path": "/Workspace/Users/someone@example.com/hello",
        "source": "WORKSPACE"
      },
      "libraries": [
        {
          "pypi": {
            "package": "wheel==0.41.2"
          }
        }
      ],
      "new_cluster": {
        "spark_version": "13.3.x-scala2.12",
        "node_type_id": "i3.xlarge",
        "num_workers": 1,
        "spark_env_vars": {
          "PYSPARK_PYTHON": "/databricks/python3/bin/python3"
        }
      }
    }
  ]
}'

Next steps