Databricks CLI (versions 0.100 and higher)

Note

This information applies to Databricks CLI versions 0.100 and higher, which are in Private Preview. To try them, reach out to your Databricks contact. To find your version of the Databricks CLI, run databricks -v.

The Databricks command-line interface (also known as the Databricks CLI) provides an easy-to-use interface to automate the Databricks platform from your terminal, command prompt, or automation scripts.

To skip ahead to usage information about all of the available commands for Databricks CLI versions 0.100 and higher, see CLI command groups.

Note

This feature is in Private Preview. To try it, reach out to your Databricks contact.

Databricks CLI versions 0.100 and higher are separate from Databricks CLI versions 0.99 and lower.

  • Databricks does not guarantee that scripts and commands targeting Databricks CLI versions 0.99 and lower will run without modification for Databricks CLI versions 0.100 and higher.

  • Databricks does not guarantee that scripts and commands targeting Databricks CLI versions 0.100 and higher will run without modification for Databricks CLI versions 0.99 and lower.

Set up the CLI

The following sections describe how to set up the Databricks CLI.

Requirements

There are no special requirements before you install the Databricks CLI. Databricks provides the Databricks CLI as a standalone executable for Linux, macOS, and Windows operating systems.

Install or update the CLI

This section describes how to install or update your development machine to run the Databricks CLI executable.

Install the CLI

  1. Download onto your local development machine the correct Databricks CLI .zip file, as provided by your Databricks contact. The .zip file must match your development machine’s operating system and architecture:

    Filename                                    Architecture

    databricks_cli_X.Y.Z_darwin_amd64.zip       macOS, Intel 64-bit
    databricks_cli_X.Y.Z_darwin_arm64.zip       macOS, Apple Silicon
    databricks_cli_X.Y.Z_linux_amd64.zip        Linux, Intel 64-bit
    databricks_cli_X.Y.Z_linux_arm64.zip        Linux, ARM 64-bit
    databricks_cli_X.Y.Z_windows_386.zip        Windows, Intel 32-bit
    databricks_cli_X.Y.Z_windows_amd64.zip      Windows, Intel 64-bit
    databricks_cli_X.Y.Z_windows_arm64.zip      Windows, ARM 64-bit

    To get your machine’s architecture, see your operating system’s documentation.

    Databricks also provides a checksum file named databricks_cli_X.Y.Z_SHA256SUMS if you need to verify the integrity of the .zip file. To get this checksum file, reach out to your Databricks contact. To run a checksum verification, see your operating system’s documentation.

  2. Extract the contents of the downloaded .zip file. (To extract the .zip file, see your operating system’s documentation.)

  3. The extracted content contains a folder with the same name as the .zip file. Inside this folder is the Databricks CLI executable. You can leave the Databricks CLI executable there, or you can copy or move it to another location.
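
    For example, on Linux or macOS, you can run the executable directly from the extracted folder by specifying its path (the path here is a hypothetical location, for illustration):

    ./databricks_cli_X.Y.Z_linux_amd64/databricks -v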

Update the CLI

  1. Optionally, delete the Databricks CLI executable, the .zip file, and the .zip file’s extracted folder from the previous procedure.

  2. Install the latest version of the Databricks CLI by following the instructions in the previous procedure.

To view the Databricks CLI executable’s version, run the version command or use the -v option:

databricks -v
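
# Or:
databricks version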

Tip

If you try to run databricks but get an error such as command not found: databricks, either specify the path to the Databricks CLI executable, or omit the path and make sure that the Databricks CLI executable is referenced in your operating system’s PATH environment variable. To manage your PATH environment variable, see your operating system’s documentation.
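
For example, on Linux or macOS with bash or zsh, you can append the folder that contains the executable to your PATH for the current shell session (the folder path here is a hypothetical placeholder):

export PATH=$PATH:/path/to/databricks/cli/folder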

Set up authentication

Before you can run Databricks CLI commands, you must set up authentication between the Databricks CLI and Databricks.

The Databricks CLI implements the Databricks client unified authentication standard, a consolidated and consistent architectural and programmatic approach to authentication. This approach helps make setting up and automating authentication with Databricks more centralized and predictable. It enables you to configure Databricks authentication once and then use that configuration across multiple Databricks tools and SDKs without further authentication configuration changes. See Databricks client unified authentication.

You must authenticate the Databricks CLI to the relevant resources at run time in order to run Databricks automation commands within a Databricks account or workspace. Depending on whether you want to call Databricks workspace-level REST APIs, Databricks account-level REST APIs, or both, you must authenticate to the Databricks workspace, account, or both.

The following sections provide information about setting up authentication between the Databricks CLI and Databricks:

Token authentication

Token authentication uses a Databricks personal access token to authenticate the target Databricks entity, such as a Databricks user account. See Token authentication.

To create a Databricks personal access token, see Databricks personal access tokens.

Note

You cannot use token authentication for authenticating with a Databricks account, as Databricks account-level operations do not use Databricks personal access tokens for authentication. To authenticate with a Databricks account, consider using one of the following authentication types instead:

  • Basic authentication

  • OAuth machine-to-machine (M2M) authentication

  • OAuth user-to-machine (U2M) authentication

To configure and use token authentication, do the following:

  1. Create or identify a Databricks configuration profile with the following fields in your .databrickscfg file. If you create the profile, replace the placeholders with the appropriate values.

    [<some-unique-configuration-profile-name>]
    host  = <workspace-url>
    token = <token>

  2. Use the Databricks CLI’s --profile or -p option followed by the name of your configuration profile, as part of the Databricks CLI command call, for example databricks clusters list -p <configuration-profile-name>.

Tip

For workspace-level operations, for token authentication only, you can use the Databricks CLI to create a configuration profile instead of manually creating one. To do this, use the Databricks CLI to run the configure command as follows:

databricks configure --host <workspace-url> -t -p <some-unique-configuration-profile-name>

For <workspace-url>, enter https:// followed by your instance name, for example https://<prefix>.cloud.databricks.com. To get your instance name, see Workspace instance names, URLs, and IDs.

If you try to run databricks but get an error such as command not found: databricks, either specify the path to the Databricks CLI executable, or omit the path and make sure that the Databricks CLI executable is referenced in your operating system’s PATH environment variable. To manage your PATH environment variable, see your operating system’s documentation.

The command prompts you to enter your access token that maps to the specified <workspace-url>:

✔ Databricks Token:

After you enter your access token, a corresponding configuration profile is added to your .databrickscfg file.

Basic authentication

Basic authentication uses a Databricks username and password to authenticate the target Databricks user account. See Basic authentication.

To configure and use basic authentication, do the following:

  1. Create or identify a Databricks configuration profile with the following fields in your .databrickscfg file. If you create the profile, replace the placeholders with the appropriate values.

    For account-level commands, set the following values in your .databrickscfg file:

    [<some-unique-configuration-profile-name>]
    host       = <account-console-url>
    account_id = <account-id>
    username   = <username>
    password   = <password>

    For workspace-level commands, set the following values in your .databrickscfg file:

    [<some-unique-configuration-profile-name>]
    host       = <workspace-url>
    username   = <username>
    password   = <password>

  2. Use the Databricks CLI’s --profile or -p option followed by the name of your configuration profile, as part of the Databricks CLI command call, for example databricks clusters list -p <configuration-profile-name>.

OAuth machine-to-machine (M2M) authentication

Instead of authenticating with Databricks by using token authentication, you can use OAuth authentication. OAuth provides tokens with faster expiration times than Databricks personal access tokens, and offers better server-side session invalidation and scoping. Because OAuth access tokens expire in less than an hour, this reduces the risk associated with accidentally checking tokens into source control. See OAuth machine-to-machine (M2M) authentication.

To configure and use OAuth M2M authentication, do the following:

  1. Complete the OAuth M2M authentication setup instructions. See Steps 1–3 in Authentication using OAuth tokens for service principals.

    Important

    You only need to complete Steps 1–3 in the preceding article’s instructions.

    Step 4 in that article covers manually creating OAuth access tokens; however, the Databricks CLI automatically creates and manages OAuth access tokens for your target Databricks service principal on your behalf. Step 5 in that article covers using curl to call the Databricks REST API, instead of using the Databricks CLI.

  2. Create or identify a Databricks configuration profile with the following fields in your .databrickscfg file. If you create the profile, replace the placeholders with the appropriate values.

    For account-level commands, set the following values in your .databrickscfg file:

    [<some-unique-configuration-profile-name>]
    host          = <account-console-url>
    account_id    = <account-id>
    client_id     = <service-principal-client-id>
    client_secret = <service-principal-oauth-secret>

    For workspace-level commands, set the following values in your .databrickscfg file:

    [<some-unique-configuration-profile-name>]
    host          = <workspace-url>
    client_id     = <service-principal-client-id>
    client_secret = <service-principal-oauth-secret>

  3. Use the Databricks CLI’s --profile or -p option followed by the name of your configuration profile, as part of the Databricks CLI command call, for example databricks clusters list -p <configuration-profile-name>.

OAuth user-to-machine (U2M) authentication

Instead of authenticating with Databricks by using token authentication, you can use OAuth authentication. OAuth provides tokens with faster expiration times than Databricks personal access tokens, and offers better server-side session invalidation and scoping. Because OAuth access tokens expire in less than an hour, this reduces the risk associated with accidentally checking tokens into source control. See OAuth user-to-machine (U2M) authentication.

To configure and use OAuth U2M authentication, do the following:

  1. Complete the OAuth U2M authentication setup instructions. See OAuth user-to-machine (U2M) authentication.

  2. For account-level commands, initiate OAuth token management locally by running the following command. Replace the placeholders with the appropriate values. When prompted, complete the on-screen instructions.

    databricks auth login --host <account-console-url> --account-id <account-id>
    
  3. For workspace-level commands, initiate OAuth token management locally by running the following command. Replace the placeholder with the appropriate value. When prompted, complete the on-screen instructions.

    databricks auth login --host <workspace-url>
    
  4. Create or identify a Databricks configuration profile with the following fields in your .databrickscfg file. If you create the profile, replace the placeholders with the appropriate values.

    For account-level commands, set the following values in your .databrickscfg file:

    [<some-unique-configuration-profile-name>]
    host       = <account-console-url>
    account_id = <account-id>

    For workspace-level commands, set the following values in your .databrickscfg file:

    [<some-unique-configuration-profile-name>]
    host = <workspace-url>

  5. Use the Databricks CLI’s --profile or -p option followed by the name of your configuration profile, as part of the Databricks CLI command call, for example databricks clusters list -p <configuration-profile-name>.

Get information about configuration profiles

Adding multiple configuration profiles to the .databrickscfg file enables you to quickly run commands across various workspaces by specifying the target profile’s name in the command’s --profile or -p option, for those commands that support this option. If you do not specify the --profile or -p option in a command that supports it, the command uses the DEFAULT configuration profile.

For example, you could have a configuration profile named DEV that references a Databricks workspace that you use for development workloads and a separate configuration profile named PROD that references a different Databricks workspace that you use for production workloads.
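
For instance, such a .databrickscfg file might look like the following sketch, which assumes token authentication and hypothetical workspace URLs:

[DEV]
host  = https://dev-workspace.cloud.databricks.com
token = <dev-token>

[PROD]
host  = https://prod-workspace.cloud.databricks.com
token = <prod-token>

You could then run, for example, databricks clusters list -p DEV to target the development workspace.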

You can change the default path of the .databrickscfg file by setting the environment variable DATABRICKS_CONFIG_FILE. To learn how to set environment variables, see your operating system’s documentation.
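
For example, on Linux or macOS with bash or zsh (the file path here is a hypothetical placeholder):

export DATABRICKS_CONFIG_FILE=/path/to/alternate/.databrickscfg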

To get information about an existing configuration profile, run the auth env command:

databricks auth env -p <profile-name>

# Or:
databricks auth env --host <workspace-url>

Tip

If you try to run databricks but get an error such as command not found: databricks, either specify the path to the Databricks CLI executable, or omit the path and make sure that the Databricks CLI executable is referenced in your operating system’s PATH environment variable. To manage your PATH environment variable, see your operating system’s documentation.

For example, here is the output for a profile that is configured with Databricks access token authentication:

{
  "env": {
    "DATABRICKS_AUTH_TYPE": "pat",
    "DATABRICKS_CONFIG_PROFILE": "<profile-name>",
    "DATABRICKS_HOST": "<workspace-url>",
    "DATABRICKS_TOKEN": "<token-value>"
  }
}

To get information about all available profiles, run the auth profiles command:

databricks auth profiles

Output (the ellipses represent omitted content, for brevity):

{
  "profiles": [
    {
      "name": "<profile-name>",
      "host": "<workspace-url>",
      "cloud": "<cloud-id>",
      "auth_type": "<auth-type>",
      "valid": true
    },
    {
      "...": "..."
    }
  ]
}

The output of the auth profiles command does not display any access tokens. To display an access token, run the preceding auth env command.

Important

The Databricks CLI does not work with a .netrc file. You can have a .netrc file in your environment for other purposes, but the Databricks CLI will not use that .netrc file.

Test your DEFAULT configuration profile setup

To check whether you set up authentication correctly, you can run a command such as the following, which lists the available Databricks Runtime versions for the Databricks workspace that is associated with your DEFAULT profile.

The following call assumes that you have not set any special environment variables, which take precedence over the settings in your DEFAULT profile. For more information, see Authentication order of evaluation.

databricks clusters spark-versions
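
To get the same response as JSON instead of text, you can add the global -o option (see Global flags):

databricks clusters spark-versions -o json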

Tip

If you try to run databricks but get an error such as command not found: databricks, either specify the path to the Databricks CLI executable, or omit the path and make sure that the Databricks CLI executable is referenced in your operating system’s PATH environment variable. To manage your PATH environment variable, see your operating system’s documentation.

Test your configuration profiles

To check whether you set up any configuration profiles correctly, you can run a command such as the following with one of your configuration profile names. This command lists the available Databricks Runtime versions for the Databricks workspace that is associated with the specified profile, represented here by the placeholder <profile-name>:

databricks clusters spark-versions -p <profile-name>

To list details for a specific profile, run the following command:

databricks auth env -p <profile-name>

To list details for all of your available profiles, run the following command:

databricks auth profiles

Authentication order of evaluation

Whenever the Databricks CLI needs to gather the settings that are required to attempt to authenticate with a Databricks workspace or account, it searches for these settings in the following locations, in the following order.

  1. For bundle commands, the values of fields within a project’s bundle settings files. (Bundle settings files do not support the direct inclusion of access credential values.)

  2. The values of environment variables, as listed within this article and in Environment variables and fields for client unified authentication.

  3. Configuration profile field values within the .databrickscfg file, as listed previously within this article.

Whenever the Databricks CLI finds the required settings that it needs, it stops searching in other locations. For example:

  • The Databricks CLI needs the value of a Databricks personal access token. A DATABRICKS_TOKEN environment variable is set, and the .databrickscfg file also contains multiple personal access tokens. In this example, the Databricks CLI uses the value of the DATABRICKS_TOKEN environment variable and does not search the .databrickscfg file.

  • The databricks bundle deploy -e development command needs the value of a Databricks personal access token. A DATABRICKS_TOKEN environment variable is not set, and the .databrickscfg file contains multiple personal access tokens. The project’s bundle settings file contains a development environment declaration that references through its profile field a configuration profile named DEV. In this example, the Databricks CLI searches the .databrickscfg file for a profile named DEV and uses the value of that profile’s token field.

  • The databricks bundle run -e development hello-job command needs the value of a Databricks personal access token. A DATABRICKS_TOKEN environment variable is not set, and the .databrickscfg file contains multiple personal access tokens. The project’s bundle settings file contains a development environment declaration that references through its host field a specific Databricks workspace URL. In this example, the Databricks CLI searches through the configuration profiles within the .databrickscfg file for a profile that contains a host field with a matching workspace URL. The Databricks CLI finds a matching host field and then uses that profile’s token field value.
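
As a minimal sketch of the first example above, assuming a Linux or macOS shell:

# The environment variable takes precedence over any tokens in .databrickscfg:
export DATABRICKS_TOKEN=<token>
databricks clusters list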

Use the CLI

This section shows you how to list Databricks CLI command groups and commands, display Databricks CLI help, and work with Databricks CLI output.

List CLI command groups

You list the command groups by using the --help or -h option. For example:

databricks -h

List CLI commands

You list the commands for any command group by using the --help or -h option. For example, to list the clusters commands:

databricks clusters -h

Display CLI command help

You display the help for a command by using the --help or -h option. For example, to display the help for the clusters list command:

databricks clusters list -h

Use jq to parse CLI output

The output responses of some Databricks CLI commands are formatted as JSON. In many cases, the Databricks CLI formats the JSON output so that it is easier to read. However, sometimes it can be useful to parse out parts of the JSON instead of listing the entire response. For example, to list just the display name of a Databricks cluster with the specified cluster ID, you can use the utility jq:

databricks clusters get 1234-567890-abcde123 | jq -r .cluster_name

Output:

My-11.3-LTS-Cluster

You can install jq, for example, on macOS by using Homebrew with brew install jq or on Windows by using Chocolatey with choco install jq. For more information about jq, see the jq Manual.
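
As another sketch, assuming that databricks clusters list -o json returns a JSON array of cluster objects (the exact output shape can vary by CLI version), you could list all cluster names as follows:

databricks clusters list -o json | jq -r '.[].cluster_name'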

JSON string parameters

The format of JSON string parameters is handled differently depending on your operating system:

On Linux and macOS, you must enclose JSON string parameters in double quotes, and you must enclose the entire JSON payload in single quotes. Some examples:

'{"cluster_id": "1234-567890-abcde123"}'
'["20230323", "Amsterdam"]'

On Windows, you must enclose JSON string parameters and the entire JSON payload in double quotes, and each double-quote character inside the JSON payload must be preceded by \. Some examples:

"{\"cluster_id\": \"1234-567890-abcde123\"}"
"[\"20230323\", \"Amsterdam\"]"

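For example, many Databricks CLI commands accept a request payload through a --json flag; check a command’s help output to confirm that it supports this flag. Assuming it does, on Linux or macOS the call could look like this:

databricks clusters get --json '{"cluster_id": "1234-567890-abcde123"}'

And on Windows:

databricks clusters get --json "{\"cluster_id\": \"1234-567890-abcde123\"}"
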
Global flags

The following flags are available to all Databricks CLI commands. Note that some flags do not apply to some commands. For more information, see the command’s documentation.

-h or --help

  Display help for the Databricks CLI, the related command group, or the related command.

-e or --environment string

  A string representing the bundle environment to use, if applicable for the related command.

--log-file

  A string representing the file to write output logs to. If this flag is not specified, the default is to write output logs to stderr.

--log-format

  text to write output logs as text, or json to write output logs as JSON. If this flag is not specified, output logs are written as text.

--log-level

  A string representing the log level. If this flag is not specified, logging is disabled.

-o or --output

  text to write output as text, or json to write output as JSON. If this flag is not specified, output is written as text.

-p or --profile

  A string representing the name of the configuration profile to use within your .databrickscfg file. If this flag is not specified, the DEFAULT profile is used, if one exists.

--progress-format

  The format for progress logs to display: default (the default), append, inplace, or json.

CLI command groups

The Databricks CLI includes the following command groups.

Help for these command groups is included within the Databricks CLI. To display help for a command group, run databricks <command-group> -h. To display help for a command, run databricks <command-group> <command-name> -h.

Command group           Area

account                 Databricks account operations.
alerts                  Databricks SQL alerts operations.
auth                    Manage Databricks CLI authentication settings.
bundle                  Databricks application bundle operations.
catalogs                Unity Catalog catalog operations.
cluster-policies        Cluster policy operations.
clusters                Cluster operations.
completion              Enable Databricks CLI autocompletion.
configure               Manage Databricks CLI configuration.
current-user            Get information about the current authenticated Databricks user or Databricks service principal.
dashboards              Dashboard operations.
data-sources            List data source connection information for available Databricks SQL warehouses.
experiments             MLflow experiment operations.
external-locations      Unity Catalog external location operations.
functions               Unity Catalog user-defined function (UDF) operations.
git-credentials         Git credentials operations.
global-init-scripts     Global init scripts operations.
grants                  Unity Catalog access grant operations.
groups                  Databricks workspace group operations.
help                    Display help for the Databricks CLI, the related command group, or the related command.
instance-pools          Instance pool operations.
instance-profiles       Instance profile operations.
ip-access-lists         IP access list operations.
jobs                    Databricks job operations.
libraries               Library operations.
metastores              Unity Catalog metastore operations.
model-registry          Model registry operations.
permissions             Databricks object and endpoint permission operations.
pipelines               Delta Live Tables pipeline operations.
policy-families         Cluster policy family operations.
providers               Delta Sharing provider operations.
queries                 Databricks SQL query operations.
query-history           Databricks SQL query history operations.
recipient-activation    Delta Sharing recipient activation operations.
recipients              Delta Sharing recipient operations.
repos                   Databricks Repos operations.
schemas                 Unity Catalog schema operations.
secrets                 Databricks secrets operations.
service-principals      Databricks service principal operations.
serving-endpoints       Model serving endpoint operations.
shares                  Delta Sharing share operations.
storage-credentials     Unity Catalog storage credential operations.
sync                    Perform one-way synchronization of file changes within a local filesystem directory to a directory within a remote Databricks workspace.
table-constraints       Table constraint operations.
tables                  Unity Catalog table operations.
token-management        Databricks personal access token management operations.
tokens                  Databricks personal access token operations.
users                   Databricks user operations.
version                 Displays the Databricks CLI version.
warehouses              Databricks SQL warehouse operations.
workspace               Databricks workspace operations for notebooks and folders.
workspace-conf          Databricks workspace settings configuration operations.
api                     Call any available Databricks REST API endpoint. You should use this command group only for advanced scenarios, such as preview releases of specific Databricks REST APIs for which the Databricks CLI does not already wrap the target Databricks REST API within a related command.