Databricks Utilities

Databricks Utilities (DBUtils) provide helpers for common tasks that you can combine with one another. You can use the utilities to work with blob storage efficiently, to chain and parameterize notebooks, and to work with secrets.

File system utilities

The file system utilities access the Databricks File System (DBFS), making it easier to use Databricks as a file system. Learn more by running:

dbutils.fs.help()
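
As a rough illustration of the file system utilities in use, the following Python sketch writes a small text file, lists its directory, and reads the file back. It assumes a Python notebook, where dbutils is predefined. The helper name write_and_read_back and the paths are hypothetical; dbutils is passed in as an argument so that the function can also be exercised with a stand-in object outside Databricks.

```python
def write_and_read_back(dbutils, directory, file_name, contents):
    """Write a small text file to DBFS, then list the directory and read the file.

    `dbutils` is the object Databricks predefines in a notebook. It is taken
    as a parameter here (an assumption of this sketch) so that a stand-in
    can be supplied when running outside Databricks.
    """
    path = directory.rstrip("/") + "/" + file_name

    # Create the directory if it does not exist yet.
    dbutils.fs.mkdirs(directory)

    # Write the string to the file; the third argument enables overwrite.
    dbutils.fs.put(path, contents, True)

    # Each entry returned by ls exposes path, name, and size attributes.
    names = [info.name for info in dbutils.fs.ls(directory)]

    # head returns the beginning of the file as a string.
    text = dbutils.fs.head(path)
    return names, text
```

In a notebook, a call might look like write_and_read_back(dbutils, "/tmp/dbutils-demo", "hello.txt", "Hello, DBFS").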

Notebook workflow utilities

Notebook workflows allow you to chain together notebooks and act on their results. See Notebook Workflows. Learn more by running:

dbutils.notebook.help()
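
To make "chain together notebooks and act on their results" concrete, here is a minimal Python sketch. It assumes that the child notebook ends by calling dbutils.notebook.exit with a JSON string; that convention, the helper name run_child_notebook, and any notebook path you pass in are illustrative assumptions rather than requirements of the API.

```python
import json


def run_child_notebook(dbutils, path, params, timeout_seconds=60):
    """Run another notebook and return its exit value parsed as JSON.

    Assumes the child notebook finishes with something like
    dbutils.notebook.exit(json.dumps({...})); `path` is whatever notebook
    path applies in your workspace (hypothetical here).
    """
    # run takes the notebook path, a timeout in seconds, and a dict of
    # string arguments; it returns the string the child passed to exit.
    raw = dbutils.notebook.run(path, timeout_seconds, params)
    return json.loads(raw)
```

The caller can then branch on the parsed result, for example by running a follow-up notebook only when a status field reports success.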

Widget utilities

Widgets allow you to parameterize notebooks. See Widgets. Learn more by running:

dbutils.widgets.help()
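
The following Python sketch shows one way a notebook might be parameterized with a text widget. Widget values are returned as strings, so the sketch converts the value explicitly. The helper name read_int_parameter and the widget name used in the usage note are hypothetical.

```python
def read_int_parameter(dbutils, name, default, label):
    """Create a text widget and return its current value as an integer.

    `dbutils` is passed in (an assumption of this sketch) so that the
    function can be tested with a stand-in outside Databricks.
    """
    # text takes the widget name, a default value, and a display label.
    dbutils.widgets.text(name, str(default), label)

    # get returns the widget's current value as a string.
    return int(dbutils.widgets.get(name))
```

In a notebook, read_int_parameter(dbutils, "max_rows", 100, "Max rows") would create the widget and return its value as an int.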

Secrets utilities

Note

Secrets utilities are available only on clusters running Databricks Runtime 4.0 and above.

Secrets allow you to store and access sensitive credential information without making it visible in notebooks. See Secrets and Use the secret in a notebook. Learn more by running:

dbutils.secrets.help()
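
As a small illustration, the Python sketch below reads a secret and uses it to build an HTTP authorization header, so the credential never appears as a literal in the notebook. The helper name build_auth_header is hypothetical, the scope and key must already exist in your workspace, and the bearer-token header is just one assumed use of the value.

```python
def build_auth_header(dbutils, scope, key):
    """Fetch a secret and wrap it in an HTTP Authorization header.

    `scope` and `key` identify a secret that is assumed to have been
    created beforehand; the names used by callers are their own.
    """
    # get returns the secret value as a string.
    token = dbutils.secrets.get(scope=scope, key=key)
    return {"Authorization": "Bearer " + token}
```

In a notebook, a call might look like build_auth_header(dbutils, "my-scope", "api-token"), where both names are placeholders.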

Databricks Utilities API library

To accelerate application development, it can be helpful to compile, build, and test applications before you deploy them as production jobs. To enable you to compile against Databricks Utilities, Databricks provides the dbutils-api library. You can download the dbutils-api library or include the library by adding a dependency to your build file:

  • SBT

    libraryDependencies += "com.databricks" % "dbutils-api_2.11" % "0.0.3"
    
  • Maven

    <dependency>
        <groupId>com.databricks</groupId>
        <artifactId>dbutils-api_2.11</artifactId>
        <version>0.0.3</version>
    </dependency>
    
  • Gradle

    compile 'com.databricks:dbutils-api_2.11:0.0.3'
    

Once you build your application against this library, you can deploy the application on a cluster running Databricks Runtime 4.0 and above.

Important

The dbutils-api library allows you to locally compile an application that uses dbutils, but not to run it. To run the application, you must deploy it in Databricks.

Example projects

Here is an example archive containing minimal example projects that show you how to compile against the dbutils-api library with three common build tools. Build each project with the corresponding command:

  • sbt: sbt package
  • Maven: mvn install
  • Gradle: gradle build

These commands create output JARs in the following locations:

  • sbt: target/scala-2.11/dbutils-api-example_2.11-0.0.1-SNAPSHOT.jar
  • Maven: target/dbutils-api-example-0.0.1-SNAPSHOT.jar
  • Gradle: build/libs/dbutils-api-example-0.0.1-SNAPSHOT.jar

You can attach this JAR to your cluster as a library, restart the cluster (which you must do using Databricks Runtime 4.0), and then run:

example.Test()

This statement creates a text input widget with the label Hello: and the initial value World.

You can use all the other dbutils APIs the same way.

To test an application that uses the dbutils object outside Databricks, you can mock up the dbutils object by calling:

com.databricks.dbutils_v1.DBUtilsHolder.dbutils0.set(
  new com.databricks.dbutils_v1.DBUtilsV1{
    ...
  }
)

Substitute your own DBUtilsV1 instance, implementing the interface methods however you like. For example, you can provide a local file system mock for dbutils.fs.