Databricks SQL CLI
Note
This article covers the Databricks SQL CLI, which is provided as-is and is not supported by Databricks through customer technical support channels. Questions and feature requests can be communicated through the Issues page of the databricks/databricks-sql-cli repo on GitHub.
The Databricks SQL command line interface (Databricks SQL CLI) enables you to run SQL queries on your existing Databricks SQL warehouses from your terminal or Windows Command Prompt instead of from locations such as the Databricks SQL editor or a Databricks notebook. From the command line, you get productivity features such as suggestions and syntax highlighting.
Requirements
At least one Databricks SQL warehouse. Create a warehouse, if you do not already have one.
Python 3.7 or higher. To check whether you have Python installed, run the command
python --version
from your terminal or Command Prompt. (On some systems, you may need to enterpython3
instead.) Install Python, if you do not have it already installed.pip, the package installer for Python. Newer versions of Python install
pip
by default. To check whether you havepip
installed, run the commandpip --version
from your terminal or Command Prompt. (On some systems, you may need to enterpip3
instead.) Install pip, if you do not have it already installed.(Optional) A utility for creating and managing Python virtual environments, such as venv. Virtual environments help to ensure that you are using the correct versions of Python and the Databricks SQL CLI together. Setting up and using virtual environments is outside of the scope of this article. For more information, see Creating Virtual Environments.
Install the Databricks SQL CLI
After you meet the requirements, install the Databricks SQL CLI package from the Python Packaging Index (PyPI). You can use pip
to install the Databricks SQL CLI package from PyPI by running pip
with one of the following commands.
pip install databricks-sql-cli
# Or...
python -m pip install databricks-sql-cli
To upgrade a previously installed version of the Databricks SQL CLI, run pip
with one of the following commands.
pip install databricks-sql-cli --upgrade
# Or...
python -m pip install databricks-sql-cli --upgrade
To check your installed version of the Databricks SQL CLI run pip
with one of the following commands.
pip show databricks-sql-cli
# Or...
python -m pip show databricks-sql-cli
Authentication
To authenticate, you must provide the Databricks SQL CLI with your warehouse’s connection details. Specifically, you need the Server hostname and HTTP path values. You must also product the Databricks SQL CLI with the proper authentication credentials.
The Databricks SQL CLI supports two Databricks authentication types: Databricks personal access token authentication and, for Databricks SQL CLI versions 0.2.0 and above, OAuth user-to-machine (U2M) authentication.
To use Databricks personal access token authentication, create a personal access token as follows:
In your Databricks workspace, click your Databricks username in the top bar, and then select Settings from the drop down.
Click Developer.
Next to Access tokens, click Manage.
Click Generate new token.
(Optional) Enter a comment that helps you to identify this token in the future, and change the token’s default lifetime of 90 days. To create a token with no lifetime (not recommended), leave the Lifetime (days) box empty (blank).
Click Generate.
Copy the displayed token to a secure location, and then click Done.
Note
Be sure to save the copied token in a secure location. Do not share your copied token with others. If you lose the copied token, you cannot regenerate that exact same token. Instead, you must repeat this procedure to create a new token. If you lose the copied token, or you believe that the token has been compromised, Databricks strongly recommends that you immediately delete that token from your workspace by clicking the trash can (Revoke) icon next to the token on the Access tokens page.
If you are not able to create or use tokens in your workspace, this might be because your workspace administrator has disabled tokens or has not given you permission to create or use tokens. See your workspace administrator or the following topics:
There are no setup requirements to use OAuth U2M authentication.
You can provide this authentication information to the Databricks SQL CLI in several ways:
In the
dbsqlclirc
settings file in its default location (or by specifying an alternate settings file through the--clirc
option each time you run a command with the Databricks SQL CLI). See Settings file.For Databricks personal access token authentication, by setting the
DBSQLCLI_HOST_NAME
,DBSQLCLI_HTTP_PATH
andDBSQLCLI_ACCESS_TOKEN
environment variables. See Environment variables.For Databricks OAuth U2M authentication, by setting the
DBSQLCLI_HOST_NAME
andDBSQLCLI_HTTP_PATH
environment variables, and specifying the--oauth
command-line option or settingauth_type = "databricks-oauth"
in thedbsqlclirc
settings file. See Environment variables.For Databricks personal access token authentication, by specifying the
--hostname
,--http-path
, and--access-token
options each time you run a command with the Databricks SQL CLI. See Command options.For Databricks OAuth U2M authentication, by specifying the
--hostname
and--http-path
command-line options, and specifying the--oauth
command-line option or settingauth_type = "databricks-oauth"
in thedbsqlclirc
settings file, each time you run a command with the Databricks SQL CLI. See Command options.
Note
The dbsqlclirc
settings file must be present, even if you set the preceding environment variables or specify the preceding command options or both.
Whenever you run the Databricks SQL CLI, it looks for authentication details in the following order and stops when it finds the first set of details:
The
--hostname
,--http-path
, and--access-token
or--oauth
options.The
DBSQLCLI_HOST_NAME
andDBSQLCLI_HTTP_PATH
environment variables (and, for Databricks personal access token authentication, theDBSQLCLI_ACCESS_TOKEN
environment variable).The
dbsqlclirc
settings file in its default location (or an alternate settings file specified by the--clirc
option).
Settings file
To use the dbsqlclirc
settings file to provide the Databricks SQL CLI with authentication details for your Databricks SQL warehouse, run the Databricks SQL CLI for the first time, as follows:
dbsqlcli
The Databricks SQL CLI creates a settings file for you, at ~/.dbsqlcli/dbsqlclirc
on Unix, Linux, and macOS, and at %HOMEDRIVE%%HOMEPATH%\.dbsqlcli\dbsqlclirc
or %USERPROFILE%\.dbsqlcli\dbsqlclirc
on Windows. To customize this file:
Use a text editor to open and edit the
dbsqlclirc
file.Scroll to the following section:
# [credentials] # host_name = "" # http_path = "" # access_token = ""
Remove the four
#
characters, and:Next to
host_name
, enter your warehouse’s Server hostname value from the requirements between the""
characters.Next to
http_path
, enter your warehouse’s HTTP path value from the requirements between the""
characters.Next to
access_token
, enter your personal access token value from the requirements between the""
characters.Note
For Databricks OAuth U2M authentication, you must replace
access_token
withauth_type = "databricks-oauth"
, or specify the--oauth
command-line option with every call to the Databricks SQL CLI.
For example:
[credentials] host_name = "dbc-a1b2345c-d6e78.cloud.databricks.com" http_path = "/sql/1.0/warehouses/1abc2d3456e7f890a" access_token = "dapi12345678901234567890123456789012"
Save the
dbsqlclirc
file.
Alternatively, instead of using the dbsqlclirc
file in its default location, you can specify a file in a different location by adding the --clirc
command option and the path to the alternate file. That alternate file’s contents must conform to the preceding syntax.
Environment variables
To use the DBSQLCLI_HOST_NAME
and DBSQLCLI_HTTP_PATH
environment variables (and, for Databricks personal access token authentication, the DBSQLCLI_ACCESS_TOKEN
environment variable) to provide the Databricks SQL CLI with authentication details for your Databricks SQL warehouse, do the following.
Note
For Databricks OAuth U2M authentication, you must set auth_type = "databricks-oauth"
in the dbsqlclirc
settings file, or specify the --oauth
command option with every call to the Databricks SQL CLI.
To set the environment variables for only the current terminal session, run the following commands. To set the environment variables for all terminal sessions, enter the following commands into your shell’s startup file and then restart your terminal. In the following commands, replace the value of:
DBSQLCLI_HOST_NAME
with your warehouse’s Server hostname value from the requirements.DBSQLCLI_HTTP_PATH
with your warehouse’s HTTP path value from the requirements.DBSQLCLI_ACCESS_TOKEN
with your personal access token value from the requirements.
export DBSQLCLI_HOST_NAME="dbc-a1b2345c-d6e78.cloud.databricks.com"
export DBSQLCLI_HTTP_PATH="/sql/1.0/warehouses/1abc2d3456e7f890a"
export DBSQLCLI_ACCESS_TOKEN="dapi12345678901234567890123456789012"
To set the environment variables for only the current Command Prompt session, run the following commands, replacing the value of:
DBSQLCLI_HOST_NAME
with your warehouse’s Server hostname value from the requirements.DBSQLCLI_HTTP_PATH
with your warehouse’s HTTP path value from the requirements.DBSQLCLI_ACCESS_TOKEN
with your personal access token value from the requirements.:
set DBSQLCLI_HOST_NAME="dbc-a1b2345c-d6e78.cloud.databricks.com"
set DBSQLCLI_HTTP_PATH="/sql/1.0/warehouses/1abc2d3456e7f890a"
set DBSQLCLI_ACCESS_TOKEN="dapi12345678901234567890123456789012"
To set the environment variables for all Command Prompt sessions, run the following commands and then restart your Command Prompt, replacing the value of:
DBSQLCLI_HOST_NAME
with your warehouse’s Server hostname value from the requirements.DBSQLCLI_HTTP_PATH
with your warehouse’s HTTP path value from the requirements.DBSQLCLI_ACCESS_TOKEN
with your personal access token value from the requirements.
setx DBSQLCLI_HOST_NAME "dbc-a1b2345c-d6e78.cloud.databricks.com"
setx DBSQLCLI_HTTP_PATH "/sql/1.0/warehouses/1abc2d3456e7f890a"
setx DBSQLCLI_ACCESS_TOKEN "dapi12345678901234567890123456789012"
Command options
To use the --hostname
, --http-path
, and --access-token
or --oauth
options to provide the Databricks SQL CLI with authentication details for your Databricks SQL warehouse, do the following:
Do the following every time you run a command with the Databricks SQL CLI:
Specify the
--hostname
option and your warehouse’s Server hostname value from the requirements.Specify the
--http-path
option and your warehouse’s HTTP path value from the requirements.For Databricks personal access token authentication, specify the
--access-token
option and your personal access token value from the requirements.For Databricks OAuth U2M authentication, specify
--oauth
.Note
For Databricks OAuth U2M authentication, you must specify the
auth_type = "databricks-oauth"
in thedbsqlclirc
settings file, or specify the--oauth
command option with every call to the Databricks SQL CLI.
For example:
For Databricks personal access token authentication:
dbsqlcli -e "SELECT * FROM default.diamonds LIMIT 2" \
--hostname "dbc-a1b2345c-d6e78.cloud.databricks.com" \
--http-path "/sql/1.0/warehouses/1abc2d3456e7f890a" \
--access-token "dapi12345678901234567890123456789012"
For Databricks OAuth U2M authentication:
dbsqlcli -e "SELECT * FROM default.diamonds LIMIT 2" \
--hostname "dbc-a1b2345c-d6e78.cloud.databricks.com" \
--http-path "/sql/1.0/warehouses/1abc2d3456e7f890a" \
--oauth
Query sources
The Databricks SQL CLI enables you to run queries in the following ways:
From a query string.
From a file.
In a read-evaluate-print loop (REPL) approach. This approach provides suggestions as you type.
Query string
To run a query as a string, use the -e
option followed by the query, represented as a string. For example:
dbsqlcli -e "SELECT * FROM default.diamonds LIMIT 2"
Output:
_c0,carat,cut,color,clarity,depth,table,price,x,y,z
1,0.23,Ideal,E,SI2,61.5,55,326,3.95,3.98,2.43
2,0.21,Premium,E,SI1,59.8,61,326,3.89,3.84,2.31
To switch output formats, use the --table-format
option along with a value such as ascii
for ASCII table format, for example:
dbsqlcli -e "SELECT * FROM default.diamonds LIMIT 2" --table-format ascii
Output:
+-----+-------+---------+-------+---------+-------+-------+-------+------+------+------+
| _c0 | carat | cut | color | clarity | depth | table | price | x | y | z |
+-----+-------+---------+-------+---------+-------+-------+-------+------+------+------+
| 1 | 0.23 | Ideal | E | SI2 | 61.5 | 55 | 326 | 3.95 | 3.98 | 2.43 |
| 2 | 0.21 | Premium | E | SI1 | 59.8 | 61 | 326 | 3.89 | 3.84 | 2.31 |
+-----+-------+---------+-------+---------+-------+-------+-------+------+------+------+
For a list of available output format values, see the comments for the table_format
setting in the dbsqlclirc
file.
File
To run a file that contains SQL, use the -e
option followed by the path to a .sql
file. For example:
dbsqlcli -e my-query.sql
Contents of the example my-query.sql
file:
SELECT * FROM default.diamonds LIMIT 2;
Output:
_c0,carat,cut,color,clarity,depth,table,price,x,y,z
1,0.23,Ideal,E,SI2,61.5,55,326,3.95,3.98,2.43
2,0.21,Premium,E,SI1,59.8,61,326,3.89,3.84,2.31
To switch output formats, use the --table-format
option along with a value such as ascii
for ASCII table format, for example:
dbsqlcli -e my-query.sql --table-format ascii
Output:
+-----+-------+---------+-------+---------+-------+-------+-------+------+------+------+
| _c0 | carat | cut | color | clarity | depth | table | price | x | y | z |
+-----+-------+---------+-------+---------+-------+-------+-------+------+------+------+
| 1 | 0.23 | Ideal | E | SI2 | 61.5 | 55 | 326 | 3.95 | 3.98 | 2.43 |
| 2 | 0.21 | Premium | E | SI1 | 59.8 | 61 | 326 | 3.89 | 3.84 | 2.31 |
+-----+-------+---------+-------+---------+-------+-------+-------+------+------+------+
For a list of available output format values, see the comments for the table_format
setting in the dbsqlclirc
file.
REPL
To enter read-evaluate-print loop (REPL) mode scoped to the default database, run the following command:
dbsqlcli
You can also enter REPL mode scoped to a specific database, by running the following command:
dbsqlcli <database-name>
For example:
dbsqlcli default
To exit REPL mode, run the following command:
exit
In REPL mode, you can use the following characters and keys:
Use the semicolon (
;
) to end a line.Use F3 to toggle multiline mode.
Use the spacebar to show suggestions at the insertion point, if suggestions are not already displayed.
Use the up and down arrows to navigate suggestions.
Use the right arrow to complete the highlighted suggestion.
For example:
dbsqlcli default
hostname:default> SELECT * FROM diamonds LIMIT 2;
+-----+-------+---------+-------+---------+-------+-------+-------+------+------+------+
| _c0 | carat | cut | color | clarity | depth | table | price | x | y | z |
+-----+-------+---------+-------+---------+-------+-------+-------+------+------+------+
| 1 | 0.23 | Ideal | E | SI2 | 61.5 | 55 | 326 | 3.95 | 3.98 | 2.43 |
| 2 | 0.21 | Premium | E | SI1 | 59.8 | 61 | 326 | 3.89 | 3.84 | 2.31 |
+-----+-------+---------+-------+---------+-------+-------+-------+------+------+------+
2 rows in set
Time: 0.703s
hostname:default> exit
Logging
The Databricks SQL CLI logs its messages to the file ~/.dbsqlcli/app.log
by default. To change this file name or location, change the value of the log_file
setting in the dbsqlclirc
settings file.
By default, messages are logged at the INFO
log level and below. To change this log level, change the value of the log_level
setting in the dbsqlclirc
settings file. Available log level values include CRITICAL
, ERROR
, WARNING
, INFO
, and DEBUG
and are evaluated in that order. NONE
disables logging.