Databricks SQL Driver for Go

The Databricks SQL Driver for Go is a Go library that allows you to use Go code to run SQL commands on Databricks compute resources.

Requirements

Connect with a DSN connection string

To access clusters and SQL warehouses, use sql.Open() to create a database handle through a data source name (DSN) connection string. This code example retrieves the DSN connection string from an environment variable named DATABRICKS_DSN:

package main

import (
  "database/sql"
  "os"
  _ "github.com/databricks/databricks-sql-go"
)

func main() {
  dsn := os.Getenv("DATABRICKS_DSN")

  if dsn == "" {
    panic("No connection string found. " +
          "Set the DATABRICKS_DSN environment variable, and try again.")
  }

  db, err := sql.Open("databricks", dsn)
  if err != nil {
    panic(err)
  }
  defer db.Close()

  if err := db.Ping(); err != nil {
    panic(err)
  }
}

To specify the DSN connection string in the correct format, use the following syntax, where:

  • [your token] is your Databricks personal access token from the requirements.

  • [Workspace hostname] is the Server Hostname value from the requirements.

  • [Port number] is the Port value from the requirements.

  • [Endpoint HTTP Path] is the HTTP Path value from the requirements.

  • param=value is one or more Optional parameters listed later in this article.

token:[your token]@[Workspace hostname]:[Port number][Endpoint HTTP Path]?param=value

For example, for a cluster:

token:dapi12345678901234567890123456789012@dbc-a1b2345c-d6e7.cloud.databricks.com:443/sql/protocolv1/o/1234567890123456/1234-567890-abcdefgh

For example, for a SQL warehouse:

token:dapi12345678901234567890123456789012@dbc-a1b2345c-d6e7.cloud.databricks.com:443/sql/1.0/endpoints/a1b234c5678901d2

Note

As a security best practice, you should not hard-code this DSN connection string into your Go code. Instead, you should retrieve this DSN connection string from a secure location. For example, the code example earlier in this article used an environment variable.

Optional parameters

  • Supported optional connection parameters can be specified in param=value and include:

    • catalog: Sets the initial catalog name in the session.

    • schema: Sets the initial schema name in the session.

    • maxRows: Sets up the maximum number of rows fetched per request. The default is 10000.

    • timeout: Adds the timeout (in seconds) for the server query execution. The default is no timeout.

    • userAgentEntry: Used to identify partners. For more information, see your partner’s documentation.

  • Supported optional session parameters can be specified in param=value and include:

    • ansi_mode: A Boolean string. true for session statements to adhere to rules specified by the ANSI SQL specification. The system default is false.

    • timezone: A string, for example America/Los_Angeles. Sets the timezone of the session. The system default is UTC.

For example, for a SQL warehouse:

token:dapi12345678901234567890123456789012@dbc-a1b2345c-d6e7.cloud.databricks.com:443/sql/1.0/endpoints/a1b234c5678901d2?catalog=hive_metastore&schema=example&maxRows=100&timeout=60&timezone=America/Sao_Paulo&ansi_mode=true

Connect with a new connector object

Alternatively, use sql.OpenDB() to create a database handle through a new connector object that is created with dbsql.NewConnector() (connecting to Databricks clusters and SQL warehouses with a new connector object requires v1.0.0 or higher of the Databricks SQL Driver for Go). For example:

package main

import (
  "database/sql"
  "os"
  "strconv"
  dbsql "github.com/databricks/databricks-sql-go"
)

func main() {
  port, err := strconv.Atoi(os.Getenv("DATABRICKS_PORT"))
  if err != nil {
    panic(err)
  }

  connector, err := dbsql.NewConnector(
    dbsql.WithAccessToken(os.Getenv("DATABRICKS_ACCESS_TOKEN")),
    dbsql.WithServerHostname(os.Getenv("DATABRICKS_HOST")),
    dbsql.WithPort(port),
    dbsql.WithHTTPPath(os.Getenv("DATABRICKS_HTTP_PATH")),
  )
  if err != nil {
    panic(err)
  }

  db := sql.OpenDB(connector)
  defer db.Close()

  if err := db.Ping(); err != nil {
    panic(err)
  }
}

Note

As a security best practice, you should not hard-code your access token, server hostname, port, and HTTP endpoint path strings into your Go code. Instead, you should retrieve these values from a secure location. For example, the preceding code uses environment variables.

Supported functional options include:

  • WithServerHostname(<Workspace hostname> string): The Server Hostname value from the requirements. Required.

  • WithPort(<Port> int): The server’s port number, typically 443. Required.

  • WithAccessToken(<Your token> string): Your Databricks personal access token from the requirements. Required.

  • WithHTTPPath(<Endpoint HTTP path> string): The HTTP Path value from the requirements. Requirements.

  • WithInitialNamespace(<catalog> string, <schema> string):The catalog and schema name in the session. Optional.

  • WithMaxRows(<max rows> int): The maximum number of rows fetched per request. The default is 10000. Optional.

  • WithSessionParams(<params-map> map[string]string): The session parameters including “timezone” and “ansi_mode”. Optional.

  • WithTimeout(<timeout> Duration). The timeout (in time.Duration) for the server query execution. The default is no timeout. Optional.

  • WithUserAgentEntry(<isv name + product name> string). Used to identify partners. For more information, see your partner’s documentation. Optional.

For example:

connector, err := dbsql.NewConnector(
  dbsql.WithServerHostname(os.Getenv("DATABRICKS_HOST")),
  dbsql.WithPort(os.Getenv("DATABRICKS_PORT")),
  dbsql.WithAccessToken(os.Getenv("DATABRICKS_ACCESS_TOKEN")),
  dbsql.WithHTTPPath(os.Getenv("DATABRICKS_HTTP_PATH")),
  dbsql.WithInitialNamespace("hive_metastore", "example"),
  dbsql.WithMaxRows(100),
  dbsql.SessionParams(map[string]string{"timezone": "America/Sao_Paulo", "ansi_mode": "true"}),
  dbsql.WithTimeout(time.Minute),
  dbsql.WithUserAgentEntry("example-user"),
)

Query data

The following code example demonstrates how to call the Databricks SQL Driver for Go to run a basic SQL query on a Databricks compute resource. This command returns the first two rows from the diamonds table.

The diamonds table is included in Sample datasets. This table is also featured in Tutorial: Query data with notebooks.

This code example retrieves the DSN connection string from an environment variable named DATABRICKS_DSN.

package main

import (
  "database/sql"
  "fmt"
  "os"

  _ "github.com/databricks/databricks-sql-go"
)

func main() {
  dsn := os.Getenv("DATABRICKS_DSN")

  if dsn == "" {
    panic("No connection string found." +
          "Set the DATABRICKS_DSN environment variable, and try again.")
  }

  db, err := sql.Open("databricks", dsn)

  if err != nil {
    panic(err)
  }

  defer db.Close()

  var (
    _c0     string
    carat   string
    cut     string
    color   string
    clarity string
    depth   string
    table   string
    price   string
    x       string
    y       string
    z       string
  )

  rows, err := db.Query("SELECT * FROM default.diamonds LIMIT 2")

  if err != nil {
    panic(err)
  }

  defer rows.Close()

  for rows.Next() {
    err := rows.Scan(&_c0,
      &carat,
      &cut,
      &color,
      &clarity,
      &depth,
      &table,
      &price,
      &x,
      &y,
      &z)

    if err != nil {
      panic(err)
    }

    fmt.Print(_c0, ",",
      carat, ",",
      cut, ",",
      color, ",",
      clarity, ",",
      depth, ",",
      table, ",",
      price, ",",
      x, ",",
      y, ",",
      z, "\n")
  }

  err = rows.Err()

  if err != nil {
    panic(err)
  }
}

Output:

1,0.23,Ideal,E,SI2,61.5,55,326,3.95,3.98,2.43
2,0.21,Premium,E,SI1,59.8,61,326,3.89,3.84,2.31

For additional examples, see the examples folder in the databricks/databricks-sql-go repository on GitHub.

Additional resources