Databricks SQL CLI

Note

This article covers the Databricks SQL CLI, which is provided as-is and is not supported by Databricks through customer technical support channels. Questions and feature requests can be communicated through the Issues page of the databricks/databricks-sql-cli repo on GitHub.

The Databricks SQL command line interface (Databricks SQL CLI) enables you to run SQL queries on your existing Databricks SQL warehouses from your terminal or Windows Command Prompt instead of from locations such as the Databricks SQL editor or a Databricks notebook. From the command line, you get productivity features such as suggestions and syntax highlighting.

Requirements

  • At least one Databricks SQL warehouse. Create a warehouse, if you do not already have one.

  • Python 3.7 or higher. To check whether you have Python installed, run the command python --version from your terminal or Command Prompt. (On some systems, you may need to enter python3 instead.) Install Python, if you do not have it already installed.

  • pip, the package installer for Python. Newer versions of Python install pip by default. To check whether you have pip installed, run the command pip --version from your terminal or Command Prompt. (On some systems, you may need to enter pip3 instead.) Install pip, if you do not have it already installed.

  • (Optional) A utility for creating and managing Python virtual environments, such as venv. Virtual environments help to ensure that you are using the correct versions of Python and the Databricks SQL CLI together. Setting up and using virtual environments is outside of the scope of this article. For more information, see Creating Virtual Environments.

Install the Databricks SQL CLI

After you meet the requirements, install the Databricks SQL CLI package from the Python Packaging Index (PyPI). You can use pip to install the Databricks SQL CLI package from PyPI by running pip with one of the following commands.

pip install databricks-sql-cli

# Or...

python -m pip install databricks-sql-cli

To upgrade a previously installed version of the Databricks SQL CLI, run pip with one of the following commands.

pip install databricks-sql-cli --upgrade

# Or...

python -m pip install databricks-sql-cli --upgrade

To check your installed version of the Databricks SQL CLI run pip with one of the following commands.

pip show databricks-sql-cli

# Or...

python -m pip show databricks-sql-cli

Authentication

To authenticate, you must provide the Databricks SQL CLI with your warehouse’s connection details. Specifically, you need the Server hostname and HTTP path values. You must also product the Databricks SQL CLI with the proper authentication credentials.

The Databricks SQL CLI supports two Databricks authentication types: Databricks personal access token authentication and, for Databricks SQL CLI versions 0.2.0 and above, OAuth user-to-machine (U2M) authentication. Databricks recommends you use OAuth for greaater security and ease of use.

To use Databricks personal access token authentication, you must create a personal access token. For details on this process, see Databricks personal access token authentication.

There are no setup requirements to use OAuth U2M authentication.

You can provide this authentication information to the Databricks SQL CLI in several ways:

  • In the dbsqlclirc settings file in its default location (or by specifying an alternate settings file through the --clirc option each time you run a command with the Databricks SQL CLI). See Settings file.

  • For Databricks personal access token authentication, by setting the DBSQLCLI_HOST_NAME, DBSQLCLI_HTTP_PATH and DBSQLCLI_ACCESS_TOKEN environment variables. See Environment variables.

  • For Databricks OAuth U2M authentication, by setting the DBSQLCLI_HOST_NAME and DBSQLCLI_HTTP_PATH environment variables, and specifying the --oauth command-line option or setting auth_type = "databricks-oauth" in the dbsqlclirc settings file. See Environment variables.

  • For Databricks personal access token authentication, by specifying the --hostname, --http-path, and --access-token options each time you run a command with the Databricks SQL CLI. See Command options.

  • For Databricks OAuth U2M authentication, by specifying the --hostname and --http-path command-line options, and specifying the --oauth command-line option or setting auth_type = "databricks-oauth" in the dbsqlclirc settings file, each time you run a command with the Databricks SQL CLI. See Command options.

Note

The dbsqlclirc settings file must be present, even if you set the preceding environment variables or specify the preceding command options or both.

Whenever you run the Databricks SQL CLI, it looks for authentication details in the following order and stops when it finds the first set of details:

  1. The --hostname, --http-path, and --access-token or --oauth options.

  2. The DBSQLCLI_HOST_NAME and DBSQLCLI_HTTP_PATH environment variables (and, for Databricks personal access token authentication, the DBSQLCLI_ACCESS_TOKEN environment variable).

  3. The dbsqlclirc settings file in its default location (or an alternate settings file specified by the --clirc option).

Settings file

To use the dbsqlclirc settings file to provide the Databricks SQL CLI with authentication details for your Databricks SQL warehouse, run the Databricks SQL CLI for the first time, as follows:

dbsqlcli

The Databricks SQL CLI creates a settings file for you, at ~/.dbsqlcli/dbsqlclirc on Unix, Linux, and macOS, and at %HOMEDRIVE%%HOMEPATH%\.dbsqlcli\dbsqlclirc or %USERPROFILE%\.dbsqlcli\dbsqlclirc on Windows. To customize this file:

  1. Use a text editor to open and edit the dbsqlclirc file.

  2. Scroll to the following section:

    # [credentials]
    # host_name = ""
    # http_path = ""
    # access_token = ""
    
  3. Remove the four # characters, and:

    1. Next to host_name, enter your warehouse’s Server hostname value from the requirements between the "" characters.

    2. Next to http_path, enter your warehouse’s HTTP path value from the requirements between the "" characters.

    3. Next to access_token, enter your personal access token value from the requirements between the "" characters.

      Note

      For Databricks OAuth U2M authentication, you must replace access_token with auth_type = "databricks-oauth", or specify the --oauth command-line option with every call to the Databricks SQL CLI.

    For example:

    [credentials]
    host_name = "dbc-a1b2345c-d6e78.cloud.databricks.com"
    http_path = "/sql/1.0/warehouses/1abc2d3456e7f890a"
    access_token = "dapi12345678901234567890123456789012"
    
  4. Save the dbsqlclirc file.

Alternatively, instead of using the dbsqlclirc file in its default location, you can specify a file in a different location by adding the --clirc command option and the path to the alternate file. That alternate file’s contents must conform to the preceding syntax.

Environment variables

To use the DBSQLCLI_HOST_NAME and DBSQLCLI_HTTP_PATH environment variables (and, for Databricks personal access token authentication, the DBSQLCLI_ACCESS_TOKEN environment variable) to provide the Databricks SQL CLI with authentication details for your Databricks SQL warehouse, do the following.

Note

For Databricks OAuth U2M authentication, you must set auth_type = "databricks-oauth" in the dbsqlclirc settings file, or specify the --oauth command option with every call to the Databricks SQL CLI.

To set the environment variables for only the current terminal session, run the following commands. To set the environment variables for all terminal sessions, enter the following commands into your shell’s startup file and then restart your terminal. In the following commands, replace the value of:

  • DBSQLCLI_HOST_NAME with your warehouse’s Server hostname value from the requirements.

  • DBSQLCLI_HTTP_PATH with your warehouse’s HTTP path value from the requirements.

  • DBSQLCLI_ACCESS_TOKEN with your personal access token value from the requirements.

export DBSQLCLI_HOST_NAME="dbc-a1b2345c-d6e78.cloud.databricks.com"
export DBSQLCLI_HTTP_PATH="/sql/1.0/warehouses/1abc2d3456e7f890a"
export DBSQLCLI_ACCESS_TOKEN="dapi12345678901234567890123456789012"

To set the environment variables for only the current Command Prompt session, run the following commands, replacing the value of:

  • DBSQLCLI_HOST_NAME with your warehouse’s Server hostname value from the requirements.

  • DBSQLCLI_HTTP_PATH with your warehouse’s HTTP path value from the requirements.

  • DBSQLCLI_ACCESS_TOKEN with your personal access token value from the requirements.:

set DBSQLCLI_HOST_NAME="dbc-a1b2345c-d6e78.cloud.databricks.com"
set DBSQLCLI_HTTP_PATH="/sql/1.0/warehouses/1abc2d3456e7f890a"
set DBSQLCLI_ACCESS_TOKEN="dapi12345678901234567890123456789012"

To set the environment variables for all Command Prompt sessions, run the following commands and then restart your Command Prompt, replacing the value of:

  • DBSQLCLI_HOST_NAME with your warehouse’s Server hostname value from the requirements.

  • DBSQLCLI_HTTP_PATH with your warehouse’s HTTP path value from the requirements.

  • DBSQLCLI_ACCESS_TOKEN with your personal access token value from the requirements.

setx DBSQLCLI_HOST_NAME "dbc-a1b2345c-d6e78.cloud.databricks.com"
setx DBSQLCLI_HTTP_PATH "/sql/1.0/warehouses/1abc2d3456e7f890a"
setx DBSQLCLI_ACCESS_TOKEN "dapi12345678901234567890123456789012"

Command options

To use the --hostname, --http-path, and --access-token or --oauth options to provide the Databricks SQL CLI with authentication details for your Databricks SQL warehouse, do the following:

Do the following every time you run a command with the Databricks SQL CLI:

  • Specify the --hostname option and your warehouse’s Server hostname value from the requirements.

  • Specify the --http-path option and your warehouse’s HTTP path value from the requirements.

  • For Databricks personal access token authentication, specify the --access-token option and your personal access token value from the requirements.

  • For Databricks OAuth U2M authentication, specify --oauth.

    Note

    For Databricks OAuth U2M authentication, you must specify the auth_type = "databricks-oauth" in the dbsqlclirc settings file, or specify the --oauth command option with every call to the Databricks SQL CLI.

For example:

For Databricks personal access token authentication:

dbsqlcli -e "SELECT * FROM default.diamonds LIMIT 2" \
--hostname "dbc-a1b2345c-d6e78.cloud.databricks.com" \
--http-path "/sql/1.0/warehouses/1abc2d3456e7f890a" \
--access-token "dapi12345678901234567890123456789012"

For Databricks OAuth U2M authentication:

dbsqlcli -e "SELECT * FROM default.diamonds LIMIT 2" \
--hostname "dbc-a1b2345c-d6e78.cloud.databricks.com" \
--http-path "/sql/1.0/warehouses/1abc2d3456e7f890a" \
--oauth

Query sources

The Databricks SQL CLI enables you to run queries in the following ways:

  • From a query string.

  • From a file.

  • In a read-evaluate-print loop (REPL) approach. This approach provides suggestions as you type.

Query string

To run a query as a string, use the -e option followed by the query, represented as a string. For example:

dbsqlcli -e "SELECT * FROM default.diamonds LIMIT 2"

Output:

_c0,carat,cut,color,clarity,depth,table,price,x,y,z
1,0.23,Ideal,E,SI2,61.5,55,326,3.95,3.98,2.43
2,0.21,Premium,E,SI1,59.8,61,326,3.89,3.84,2.31

To switch output formats, use the --table-format option along with a value such as ascii for ASCII table format, for example:

dbsqlcli -e "SELECT * FROM default.diamonds LIMIT 2" --table-format ascii

Output:

+-----+-------+---------+-------+---------+-------+-------+-------+------+------+------+
| _c0 | carat | cut     | color | clarity | depth | table | price | x    | y    | z    |
+-----+-------+---------+-------+---------+-------+-------+-------+------+------+------+
| 1   | 0.23  | Ideal   | E     | SI2     | 61.5  | 55    | 326   | 3.95 | 3.98 | 2.43 |
| 2   | 0.21  | Premium | E     | SI1     | 59.8  | 61    | 326   | 3.89 | 3.84 | 2.31 |
+-----+-------+---------+-------+---------+-------+-------+-------+------+------+------+

For a list of available output format values, see the comments for the table_format setting in the dbsqlclirc file.

File

To run a file that contains SQL, use the -e option followed by the path to a .sql file. For example:

dbsqlcli -e my-query.sql

Contents of the example my-query.sql file:

SELECT * FROM default.diamonds LIMIT 2;

Output:

_c0,carat,cut,color,clarity,depth,table,price,x,y,z
1,0.23,Ideal,E,SI2,61.5,55,326,3.95,3.98,2.43
2,0.21,Premium,E,SI1,59.8,61,326,3.89,3.84,2.31

To switch output formats, use the --table-format option along with a value such as ascii for ASCII table format, for example:

dbsqlcli -e my-query.sql --table-format ascii

Output:

+-----+-------+---------+-------+---------+-------+-------+-------+------+------+------+
| _c0 | carat | cut     | color | clarity | depth | table | price | x    | y    | z    |
+-----+-------+---------+-------+---------+-------+-------+-------+------+------+------+
| 1   | 0.23  | Ideal   | E     | SI2     | 61.5  | 55    | 326   | 3.95 | 3.98 | 2.43 |
| 2   | 0.21  | Premium | E     | SI1     | 59.8  | 61    | 326   | 3.89 | 3.84 | 2.31 |
+-----+-------+---------+-------+---------+-------+-------+-------+------+------+------+

For a list of available output format values, see the comments for the table_format setting in the dbsqlclirc file.

REPL

To enter read-evaluate-print loop (REPL) mode scoped to the default database, run the following command:

dbsqlcli

You can also enter REPL mode scoped to a specific database, by running the following command:

dbsqlcli <database-name>

For example:

dbsqlcli default

To exit REPL mode, run the following command:

exit

In REPL mode, you can use the following characters and keys:

  • Use the semicolon (;) to end a line.

  • Use F3 to toggle multiline mode.

  • Use the spacebar to show suggestions at the insertion point, if suggestions are not already displayed.

  • Use the up and down arrows to navigate suggestions.

  • Use the right arrow to complete the highlighted suggestion.

For example:

dbsqlcli default

hostname:default> SELECT * FROM diamonds LIMIT 2;

+-----+-------+---------+-------+---------+-------+-------+-------+------+------+------+
| _c0 | carat | cut     | color | clarity | depth | table | price | x    | y    | z    |
+-----+-------+---------+-------+---------+-------+-------+-------+------+------+------+
| 1   | 0.23  | Ideal   | E     | SI2     | 61.5  | 55    | 326   | 3.95 | 3.98 | 2.43 |
| 2   | 0.21  | Premium | E     | SI1     | 59.8  | 61    | 326   | 3.89 | 3.84 | 2.31 |
+-----+-------+---------+-------+---------+-------+-------+-------+------+------+------+

2 rows in set
Time: 0.703s

hostname:default> exit

Logging

The Databricks SQL CLI logs its messages to the file ~/.dbsqlcli/app.log by default. To change this file name or location, change the value of the log_file setting in the dbsqlclirc settings file.

By default, messages are logged at the INFO log level and below. To change this log level, change the value of the log_level setting in the dbsqlclirc settings file. Available log level values include CRITICAL, ERROR, WARNING, INFO, and DEBUG and are evaluated in that order. NONE disables logging.