Databricks SDK for Go
In this article, you learn how to automate operations in Databricks accounts, workspaces, and related resources with the Databricks SDK for Go.
Experimental
The Databricks SDK for Go is in an Experimental state. To provide feedback, ask questions, and report issues, use the Issues tab in the Databricks SDK for Go repository in GitHub.
During the Experimental period, Databricks is actively working on stabilizing the Databricks SDK for Go’s interfaces. API clients for all services are generated from specification files that are synchronized from the Databricks platform. You are highly encouraged to pin the exact version that you want to use in the go.mod file and to read the CHANGELOG, where Databricks documents the changes to each version. Some interfaces are more stable than others. For interfaces that are not yet nightly tested, Databricks may make minor documented backward-incompatible changes, such as fixing mapping correctness from int to int64 or renaming methods or type names for consistency.
Before you begin
Before you begin to use the Databricks SDK for Go, your development machine must have:
Go installed.
Databricks authentication configured.
Get started with the Databricks SDK for Go
On your development machine with Go already installed, an existing Go code project created, and Databricks authentication configured, create a go.mod file to track your Go code’s dependencies by running the go mod init command, for example:
go mod init sample
Take a dependency on the Databricks SDK for Go package by running the go mod edit -require command, replacing 0.8.0 with the latest version of the Databricks SDK for Go package as listed in the CHANGELOG:
go mod edit -require github.com/databricks/databricks-sdk-go@v0.8.0
Your go.mod file should now look like this:
module sample

go 1.18

require github.com/databricks/databricks-sdk-go v0.8.0
Within your project, create a Go code file that imports the Databricks SDK for Go. The following example, in a file named main.go, lists all the clusters in your Databricks workspace:
package main

import (
    "context"

    "github.com/databricks/databricks-sdk-go"
    "github.com/databricks/databricks-sdk-go/service/compute"
)

func main() {
    w := databricks.Must(databricks.NewWorkspaceClient())
    all, err := w.Clusters.ListAll(context.Background(), compute.ListClustersRequest{})
    if err != nil {
        panic(err)
    }
    for _, c := range all {
        println(c.ClusterName)
    }
}
Add any missing module dependencies by running the go mod tidy command:
go mod tidy
Note
If you get the error go: warning: "all" matched no packages, you forgot to add a Go code file that imports the Databricks SDK for Go.
Grab copies of all packages needed to support builds and tests of packages in your main module by running the go mod vendor command:
go mod vendor
Set up your development machine for Databricks authentication.
Run your Go code file, assuming a file named main.go, by running the go run command:
go run main.go
Note
Because the preceding call to w := databricks.Must(databricks.NewWorkspaceClient()) does not pass a *databricks.Config argument, the Databricks SDK for Go uses its default process to attempt Databricks authentication. To override this default behavior, see Authenticate the Databricks SDK for Go with your Databricks account or workspace.
Authenticate the Databricks SDK for Go with your Databricks account or workspace
The Databricks SDK for Go implements the Databricks client unified authentication standard, a consolidated and consistent architectural and programmatic approach to authentication. This approach helps make setting up and automating authentication with Databricks more centralized and predictable. It enables you to configure Databricks authentication once and then use that configuration across multiple Databricks tools and SDKs without further authentication configuration changes. For more information, including more complete code examples in Go, see Databricks client unified authentication.
Some of the available coding patterns to initialize Databricks authentication with the Databricks SDK for Go include:
Use Databricks default authentication by doing one of the following:
Create or identify a custom Databricks configuration profile with the required fields for the target Databricks authentication type. Then set the DATABRICKS_CONFIG_PROFILE environment variable to the name of the custom configuration profile.
Set the required environment variables for the target Databricks authentication type.
import (
    "github.com/databricks/databricks-sdk-go"
)

// ...

w := databricks.Must(databricks.NewWorkspaceClient())
Hard-coding the required fields is supported but not recommended, as it risks exposing sensitive information in your code, such as Databricks personal access tokens. The following example hard-codes Databricks host and access token values for Databricks token authentication:
import (
    "github.com/databricks/databricks-sdk-go"
)

// ...

w := databricks.Must(databricks.NewWorkspaceClient(&databricks.Config{
    Host:  "https://...",
    Token: "...",
}))
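As an alternative to hard-coding, the host and token can be read at run time. The following is a minimal sketch that uses only the Go standard library. It assumes the DATABRICKS_HOST and DATABRICKS_TOKEN environment variables used for Databricks token authentication; the credentials type and the credentialsFrom helper are hypothetical names introduced for illustration and are not part of the SDK. Default authentication already reads environment variables on its own, so a check like this is only useful if you want to fail early with a clear message.

```go
package main

import (
    "errors"
    "fmt"
    "os"
    "strings"
)

// credentials is a hypothetical holder for the two values that
// Databricks token authentication needs.
type credentials struct {
    Host  string
    Token string
}

// credentialsFrom reads the host and token through the supplied lookup
// function (os.Getenv in normal use) and rejects missing values.
func credentialsFrom(getenv func(string) string) (credentials, error) {
    c := credentials{
        Host:  strings.TrimSpace(getenv("DATABRICKS_HOST")),
        Token: strings.TrimSpace(getenv("DATABRICKS_TOKEN")),
    }
    if c.Host == "" {
        return credentials{}, errors.New("DATABRICKS_HOST is not set")
    }
    if c.Token == "" {
        return credentials{}, errors.New("DATABRICKS_TOKEN is not set")
    }
    return c, nil
}

func main() {
    c, err := credentialsFrom(os.Getenv)
    if err != nil {
        fmt.Println("cannot authenticate:", err)
        return
    }
    // The values could then be passed to the client, for example:
    //   databricks.NewWorkspaceClient(&databricks.Config{Host: c.Host, Token: c.Token})
    fmt.Println("using workspace", c.Host)
}
```

Passing the lookup function as a parameter keeps the helper easy to exercise without changing the real process environment.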
Examples
The following code examples demonstrate how to use the Databricks SDK for Go to create and delete clusters, run jobs, and list account users. These code examples use the Databricks SDK for Go’s default Databricks authentication process.
For additional code examples, see the examples folder in the Databricks SDK for Go repository in GitHub.
Create a cluster
This code example creates a cluster with the latest available Databricks Runtime Long Term Support (LTS) version and the smallest available cluster node type with a local disk. This cluster has one worker, and the cluster will automatically terminate after 15 minutes of idle time. The CreateAndWait method call causes the code to pause until the new cluster is running in the workspace.
package main

import (
    "context"
    "fmt"

    "github.com/databricks/databricks-sdk-go"
    "github.com/databricks/databricks-sdk-go/service/compute"
)

func main() {
    const clusterName = "my-cluster"
    const autoTerminationMinutes = 15
    const numWorkers = 1

    w := databricks.Must(databricks.NewWorkspaceClient())
    ctx := context.Background()

    // Get the full list of available Spark versions to choose from.
    sparkVersions, err := w.Clusters.SparkVersions(ctx)
    if err != nil {
        panic(err)
    }

    // Choose the latest Long Term Support (LTS) version.
    latestLTS, err := sparkVersions.Select(compute.SparkVersionRequest{
        Latest:          true,
        LongTermSupport: true,
    })
    if err != nil {
        panic(err)
    }

    // Get the list of available cluster node types to choose from.
    nodeTypes, err := w.Clusters.ListNodeTypes(ctx)
    if err != nil {
        panic(err)
    }

    // Choose the smallest available cluster node type with a local disk.
    smallestWithLocalDisk, err := nodeTypes.Smallest(compute.NodeTypeRequest{
        LocalDisk: true,
    })
    if err != nil {
        panic(err)
    }

    fmt.Println("Now attempting to create the cluster, please wait...")

    runningCluster, err := w.Clusters.CreateAndWait(ctx, compute.CreateCluster{
        ClusterName:            clusterName,
        SparkVersion:           latestLTS,
        NodeTypeId:             smallestWithLocalDisk,
        AutoterminationMinutes: autoTerminationMinutes,
        NumWorkers:             numWorkers,
    })
    if err != nil {
        panic(err)
    }

    switch runningCluster.State {
    case compute.StateRunning:
        fmt.Printf("The cluster is now ready at %s#setting/clusters/%s/configuration\n",
            w.Config.Host,
            runningCluster.ClusterId,
        )
    default:
        fmt.Printf("Cluster is not running or failed to create. %s", runningCluster.StateMessage)
    }

    // Output:
    //
    // Now attempting to create the cluster, please wait...
    // The cluster is now ready at <workspace-host>#setting/clusters/<cluster-id>/configuration
}
Permanently delete a cluster
This code example permanently deletes the cluster with the specified cluster ID from the workspace.
package main

import (
    "context"

    "github.com/databricks/databricks-sdk-go"
    "github.com/databricks/databricks-sdk-go/service/compute"
)

func main() {
    // Replace with your cluster's ID.
    const clusterId = "1234-567890-ab123cd4"

    w := databricks.Must(databricks.NewWorkspaceClient())
    ctx := context.Background()

    err := w.Clusters.PermanentDelete(ctx, compute.PermanentDeleteCluster{
        ClusterId: clusterId,
    })
    if err != nil {
        panic(err)
    }
}
Run a job
This code example creates a Databricks job that runs the specified notebook on the specified cluster. As the code runs, it gets the existing notebook’s path, the existing cluster ID, and related job settings from the user at the terminal. The RunNowAndWait method call causes the code to pause until the new job has finished running in the workspace.
package main

import (
    "bufio"
    "context"
    "fmt"
    "os"
    "strings"

    "github.com/databricks/databricks-sdk-go"
    "github.com/databricks/databricks-sdk-go/service/jobs"
)

func main() {
    w := databricks.Must(databricks.NewWorkspaceClient())
    ctx := context.Background()

    nt := jobs.NotebookTask{
        NotebookPath: askFor("Workspace path of the notebook to run:"),
    }

    jobToRun, err := w.Jobs.Create(ctx, jobs.CreateJob{
        Name: askFor("Some short name for the job:"),
        Tasks: []jobs.JobTaskSettings{
            {
                Description:       askFor("Some short description for the job:"),
                TaskKey:           askFor("Some key to apply to the job's tasks:"),
                ExistingClusterId: askFor("ID of the existing cluster in the workspace to run the job on:"),
                NotebookTask:      &nt,
            },
        },
    })
    if err != nil {
        panic(err)
    }

    fmt.Printf("Now attempting to run the job at %s/#job/%d, please wait...\n",
        w.Config.Host,
        jobToRun.JobId,
    )

    runningJob, err := w.Jobs.RunNowAndWait(ctx, jobs.RunNow{
        JobId: jobToRun.JobId,
    })
    if err != nil {
        panic(err)
    }

    fmt.Printf("View the job run results at %s/#job/%d/run/%d\n",
        w.Config.Host,
        runningJob.JobId,
        runningJob.RunId,
    )

    // Output:
    //
    // Now attempting to run the job at <workspace-host>/#job/<job-id>, please wait...
    // View the job run results at <workspace-host>/#job/<job-id>/run/<run-id>
}

// Get job settings from the user.
func askFor(prompt string) string {
    var s string
    r := bufio.NewReader(os.Stdin)
    for {
        fmt.Fprint(os.Stdout, prompt+" ")
        s, _ = r.ReadString('\n')
        if s != "" {
            break
        }
    }
    return strings.TrimSpace(s)
}
List account users
This code example lists the available users within a Databricks account.
package main

import (
    "context"

    "github.com/databricks/databricks-sdk-go"
    "github.com/databricks/databricks-sdk-go/service/iam"
)

func main() {
    a := databricks.Must(databricks.NewAccountClient())
    all, err := a.Users.ListAll(context.Background(), iam.ListAccountUsersRequest{})
    if err != nil {
        panic(err)
    }
    for _, u := range all {
        println(u.UserName)
    }
}
Additional resources
For more information, see:
The Databricks SDK for Go repository in GitHub
Additional code examples for the Databricks SDK for Go in GitHub