Databricks SQL Driver for Go
The Databricks SQL Driver for Go is a Go library that allows you to use Go code to run SQL commands on Databricks compute resources.
Requirements
A development machine running Go, version 1.18 or higher. To print the installed version of Go, run the command
go version
. Download and install Go.An existing cluster or SQL warehouse.
The Server Hostname, Port, and HTTP Path values for the existing cluster or SQL warehouse.
A Databricks personal access token.
Connect with a DSN connection string
To access clusters and SQL warehouses, use sql.Open()
to create a database handle through a data source name (DSN) connection string. This code example retrieves the DSN connection string from an environment variable named DATABRICKS_DSN
:
package main
import (
"database/sql"
"os"
_ "github.com/databricks/databricks-sql-go"
)
func main() {
dsn := os.Getenv("DATABRICKS_DSN")
if dsn == "" {
panic("No connection string found. " +
"Set the DATABRICKS_DSN environment variable, and try again.")
}
db, err := sql.Open("databricks", dsn)
if err != nil {
panic(err)
}
defer db.Close()
if err := db.Ping(); err != nil {
panic(err)
}
}
To specify the DSN connection string in the correct format, use the following syntax, where:
[your token]
is your Databricks personal access token from the requirements.[Workspace hostname]
is the Server Hostname value from the requirements.[Port number]
is the Port value from the requirements.[Endpoint HTTP Path]
is the HTTP Path value from the requirements.param=value
is one or more Optional parameters listed later in this article.
token:[your token]@[Workspace hostname]:[Port number][Endpoint HTTP Path]?param=value
For example, for a cluster:
token:dapi12345678901234567890123456789012@dbc-a1b2345c-d6e7.cloud.databricks.com:443/sql/protocolv1/o/1234567890123456/1234-567890-abcdefgh
For example, for a SQL warehouse:
token:dapi12345678901234567890123456789012@dbc-a1b2345c-d6e7.cloud.databricks.com:443/sql/1.0/endpoints/a1b234c5678901d2
Note
As a security best practice, you should not hard-code this DSN connection string into your Go code. Instead, you should retrieve this DSN connection string from a secure location. For example, the code example earlier in this article used an environment variable.
Optional parameters
Supported optional connection parameters can be specified in
param=value
and include:catalog
: Sets the initial catalog name in the session.schema
: Sets the initial schema name in the session.maxRows
: Sets up the maximum number of rows fetched per request. The default is10000
.timeout
: Adds the timeout (in seconds) for the server query execution. The default is no timeout.userAgentEntry
: Used to identify partners. For more information, see your partner’s documentation.
Supported optional session parameters can be specified in
param=value
and include:ansi_mode
: A Boolean string.true
for session statements to adhere to rules specified by the ANSI SQL specification. The system default is false.timezone
: A string, for exampleAmerica/Los_Angeles
. Sets the timezone of the session. The system default is UTC.
For example, for a SQL warehouse:
token:dapi12345678901234567890123456789012@dbc-a1b2345c-d6e7.cloud.databricks.com:443/sql/1.0/endpoints/a1b234c5678901d2?catalog=hive_metastore&schema=example&maxRows=100&timeout=60&timezone=America/Sao_Paulo&ansi_mode=true
Connect with a new connector object
Alternatively, use sql.OpenDB()
to create a database handle through a new connector object that is created with dbsql.NewConnector()
(connecting to Databricks clusters and SQL warehouses with a new connector object requires v1.0.0 or higher of the Databricks SQL Driver for Go). For example:
package main
import (
"database/sql"
"os"
"strconv"
dbsql "github.com/databricks/databricks-sql-go"
)
func main() {
port, err := strconv.Atoi(os.Getenv("DATABRICKS_PORT"))
if err != nil {
panic(err)
}
connector, err := dbsql.NewConnector(
dbsql.WithAccessToken(os.Getenv("DATABRICKS_ACCESS_TOKEN")),
dbsql.WithServerHostname(os.Getenv("DATABRICKS_HOST")),
dbsql.WithPort(port),
dbsql.WithHTTPPath(os.Getenv("DATABRICKS_HTTP_PATH")),
)
if err != nil {
panic(err)
}
db := sql.OpenDB(connector)
defer db.Close()
if err := db.Ping(); err != nil {
panic(err)
}
}
Note
As a security best practice, you should not hard-code your access token, server hostname, port, and HTTP endpoint path strings into your Go code. Instead, you should retrieve these values from a secure location. For example, the preceding code uses environment variables.
Supported functional options include:
WithServerHostname(<Workspace hostname> string)
: The Server Hostname value from the requirements. Required.WithPort(<Port> int)
: The server’s port number, typically443
. Required.WithAccessToken(<Your token> string)
: Your Databricks personal access token from the requirements. Required.WithHTTPPath(<Endpoint HTTP path> string)
: The HTTP Path value from the requirements. Requirements.WithInitialNamespace(<catalog> string, <schema> string)
:The catalog and schema name in the session. Optional.WithMaxRows(<max rows> int)
: The maximum number of rows fetched per request. The default is10000.
Optional.WithSessionParams(<params-map> map[string]string)
: The session parameters including “timezone” and “ansi_mode”. Optional.WithTimeout(<timeout> Duration)
. The timeout (intime.Duration
) for the server query execution. The default is no timeout. Optional.WithUserAgentEntry(<isv name + product name> string)
. Used to identify partners. For more information, see your partner’s documentation. Optional.
For example:
connector, err := dbsql.NewConnector(
dbsql.WithServerHostname(os.Getenv("DATABRICKS_HOST")),
dbsql.WithPort(os.Getenv("DATABRICKS_PORT")),
dbsql.WithAccessToken(os.Getenv("DATABRICKS_ACCESS_TOKEN")),
dbsql.WithHTTPPath(os.Getenv("DATABRICKS_HTTP_PATH")),
dbsql.WithInitialNamespace("hive_metastore", "example"),
dbsql.WithMaxRows(100),
dbsql.SessionParams(map[string]string{"timezone": "America/Sao_Paulo", "ansi_mode": "true"}),
dbsql.WithTimeout(time.Minute),
dbsql.WithUserAgentEntry("example-user"),
)
Query data
The following code example demonstrates how to call the Databricks SQL Driver for Go to run a basic SQL query on a Databricks compute resource. This command returns the first two rows from the diamonds
table.
The diamonds
table is included in Sample datasets. This table is also featured in Tutorial: Query data with notebooks.
This code example retrieves the DSN connection string from an environment variable named DATABRICKS_DSN
.
package main
import (
"database/sql"
"fmt"
"os"
_ "github.com/databricks/databricks-sql-go"
)
func main() {
dsn := os.Getenv("DATABRICKS_DSN")
if dsn == "" {
panic("No connection string found." +
"Set the DATABRICKS_DSN environment variable, and try again.")
}
db, err := sql.Open("databricks", dsn)
if err != nil {
panic(err)
}
defer db.Close()
var (
_c0 string
carat string
cut string
color string
clarity string
depth string
table string
price string
x string
y string
z string
)
rows, err := db.Query("SELECT * FROM default.diamonds LIMIT 2")
if err != nil {
panic(err)
}
defer rows.Close()
for rows.Next() {
err := rows.Scan(&_c0,
&carat,
&cut,
&color,
&clarity,
&depth,
&table,
&price,
&x,
&y,
&z)
if err != nil {
panic(err)
}
fmt.Print(_c0, ",",
carat, ",",
cut, ",",
color, ",",
clarity, ",",
depth, ",",
table, ",",
price, ",",
x, ",",
y, ",",
z, "\n")
}
err = rows.Err()
if err != nil {
panic(err)
}
}
Output:
1,0.23,Ideal,E,SI2,61.5,55,326,3.95,3.98,2.43
2,0.21,Premium,E,SI1,59.8,61,326,3.89,3.84,2.31
For additional examples, see the examples folder in the databricks/databricks-sql-go repository on GitHub.
Additional resources
The Databricks SQL Driver for Go repository on GitHub
The database/sql package home page