Databricks Utilities

Databricks Utilities (DBUtils) let you combine common tasks from within a notebook. You can use the utilities to work with blob storage efficiently, to chain and parameterize notebooks, and to work with secrets.

dbutils is available natively in Python and Scala notebooks. In R and SQL notebooks, you can invoke dbutils methods by using a language magic command such as %python or %scala.

File system utilities

The file system utilities access the Databricks File System (DBFS), making it easier to use Databricks as a file system. Learn more by running:

dbutils.fs.help()
cp(from: String, to: String, recurse: boolean = false): boolean -> Copies a file or directory, possibly across FileSystems
head(file: String, maxBytes: int = 65536): String -> Returns up to the first 'maxBytes' bytes of the given file as a String encoded in UTF-8
ls(dir: String): Seq -> Lists the contents of a directory
mkdirs(dir: String): boolean -> Creates the given directory if it does not exist, also creating any necessary parent directories
mv(from: String, to: String, recurse: boolean = false): boolean -> Moves a file or directory, possibly across FileSystems
put(file: String, contents: String, overwrite: boolean = false): boolean -> Writes the given String out to a file, encoded in UTF-8
rm(dir: String, recurse: boolean = false): boolean -> Removes a file or directory

mount(source: String, mountPoint: String, encryptionType: String = "", owner: String = null, extraConfigs: Map = Map.empty[String, String]): boolean -> Mounts the given source directory into DBFS at the given mount point
mounts: Seq -> Displays information about what is mounted within DBFS
refreshMounts: boolean -> Forces all machines in this cluster to refresh their mount cache, ensuring they receive the most recent information
unmount(mountPoint: String): boolean -> Deletes a DBFS mount point
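
As an illustration of how the calls listed above fit together, here is a minimal Python sketch. It assumes the Python API takes the same positional arguments as the signatures shown by dbutils.fs.help(), and that each entry returned by ls exposes a path attribute. The function name write_and_preview and the example paths are hypothetical. The dbutils object is passed in as a parameter so that the function can also be exercised outside a notebook with a stand-in object.

```python
def write_and_preview(dbutils, directory, file_name, text, max_bytes=100):
    """Create a directory, write a small UTF-8 file into it, and read it back."""
    # mkdirs also creates any missing parent directories.
    dbutils.fs.mkdirs(directory)
    path = directory.rstrip("/") + "/" + file_name
    # The third argument is `overwrite`; True replaces an existing file.
    dbutils.fs.put(path, text, True)
    # head returns at most `max_bytes` bytes of the file as a string.
    preview = dbutils.fs.head(path, max_bytes)
    # ls returns one entry per item in the directory.
    listed = [entry.path for entry in dbutils.fs.ls(directory)]
    return preview, listed
```

In a notebook you would call it with the built-in object, for example write_and_preview(dbutils, "/tmp/demo", "hello.txt", "Hello, DBFS").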

Notebook workflow utilities

Notebook workflows allow you to chain together notebooks and act on their results. See Notebook Workflows. Learn more by running:

dbutils.notebook.help()
exit(value: String): void -> This method lets you exit a notebook with a value
run(path: String, timeoutSeconds: int, arguments: Map): String -> This method runs a notebook and returns its exit value
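
Because run returns the child notebook's exit value as a string, a caller can wrap it in ordinary control flow. The following Python sketch shows one possible retry wrapper. It assumes that a failed or timed-out run surfaces as a Python exception. The name run_with_retry and the max_retries parameter are illustrative and not part of dbutils.

```python
def run_with_retry(dbutils, path, timeout_seconds, arguments=None, max_retries=2):
    """Run a notebook and return its exit value, retrying on failure."""
    attempts = 0
    while True:
        try:
            # Positional arguments follow the order shown by dbutils.notebook.help().
            return dbutils.notebook.run(path, timeout_seconds, arguments or {})
        except Exception:
            attempts += 1
            if attempts > max_retries:
                # Out of retries: re-raise the last failure to the caller.
                raise
```

The child notebook would hand its result back with dbutils.notebook.exit("some value"), and that string is what run_with_retry returns.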

Widget utilities

Widgets allow you to parameterize notebooks. See Widgets. Learn more by running:

dbutils.widgets.help()
combobox(name: String, defaultValue: String, choices: Seq, label: String): void -> Creates a combobox input widget with a given name, default value and choices
dropdown(name: String, defaultValue: String, choices: Seq, label: String): void -> Creates a dropdown input widget with a given name, default value and choices
get(name: String): String -> Retrieves current value of an input widget
getArgument(name: String, optional: String): String -> (DEPRECATED) Equivalent to get
multiselect(name: String, defaultValue: String, choices: Seq, label: String): void -> Creates a multiselect input widget with a given name, default value and choices
remove(name: String): void -> Removes an input widget from the notebook
removeAll: void -> Removes all widgets in the notebook
text(name: String, defaultValue: String, label: String): void -> Creates a text input widget with a given name and default value
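
A typical pattern is to declare widgets near the top of a notebook and then read their current values. The Python sketch below assumes positional arguments in the order shown by dbutils.widgets.help(). The widget names, defaults, and choices are made up for illustration.

```python
def read_report_parameters(dbutils):
    """Declare two input widgets and return their current values."""
    # text(name, defaultValue, label)
    dbutils.widgets.text("table_name", "events", "Table name")
    # dropdown(name, defaultValue, choices, label)
    dbutils.widgets.dropdown(
        "environment", "dev", ["dev", "staging", "prod"], "Environment"
    )
    # get returns the current value of each widget as a string.
    return {
        "table_name": dbutils.widgets.get("table_name"),
        "environment": dbutils.widgets.get("environment"),
    }
```

Until a user changes a widget, get is expected to return the default value supplied when the widget was created.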

Secrets utilities

Note

Secrets utilities are available only on clusters running Databricks Runtime 4.0 and above.

Secrets allow you to store and access sensitive credential information without making it visible in notebooks. See Secrets and Use the secrets in a notebook. Learn more by running:

dbutils.secrets.help()
get(scope: String, key: String): String -> Gets the string representation of a secret value with scope and key
getBytes(scope: String, key: String): byte[] -> Gets the bytes representation of a secret value with scope and key
list(scope: String): Seq -> Lists secret metadata for secrets within a scope
listScopes: Seq -> Lists secret scopes
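
The following Python sketch shows one way to read a pair of credentials from a scope, checking first that both keys exist. It assumes each metadata entry returned by list exposes a key attribute. The function name, the scope, and the key names jdbc-user and jdbc-password are hypothetical.

```python
def load_jdbc_credentials(dbutils, scope,
                          user_key="jdbc-user", password_key="jdbc-password"):
    """Fetch a user name and password from a secret scope."""
    # list returns metadata only (no secret values), so it is safe to inspect.
    available = {meta.key for meta in dbutils.secrets.list(scope)}
    missing = {user_key, password_key} - available
    if missing:
        raise KeyError(f"scope {scope!r} is missing secrets: {sorted(missing)}")
    # get returns the string representation of each secret value.
    return {
        "user": dbutils.secrets.get(scope, user_key),
        "password": dbutils.secrets.get(scope, password_key),
    }
```

Keep the returned values in variables and pass them straight to the client that needs them rather than printing them.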

Databricks Utilities API library

To accelerate application development, it can be helpful to compile, build, and test applications before you deploy them as production jobs. To enable you to compile against Databricks Utilities, Databricks provides the dbutils-api library. You can download the dbutils-api library or include the library by adding a dependency to your build file:

  • SBT

    libraryDependencies += "com.databricks" % "dbutils-api_2.11" % "0.0.3"
    
  • Maven

    <dependency>
        <groupId>com.databricks</groupId>
        <artifactId>dbutils-api_2.11</artifactId>
        <version>0.0.3</version>
    </dependency>
    
  • Gradle

    compile 'com.databricks:dbutils-api_2.11:0.0.3'
    

Once you build your application against this library, you can deploy the application on a cluster running Databricks Runtime 4.0 and above.

Important

The dbutils-api library allows you to locally compile an application that uses dbutils, but not to run it. To run the application, you must deploy it in Databricks.

Example projects

Here is an example archive containing minimal example projects that show you how to compile using the dbutils-api library for three common build tools:

  • sbt: sbt package
  • Maven: mvn install
  • Gradle: gradle build

These commands create output JARs in the following locations:

  • sbt: target/scala-2.11/dbutils-api-example_2.11-0.0.1-SNAPSHOT.jar
  • Maven: target/dbutils-api-example-0.0.1-SNAPSHOT.jar
  • Gradle: build/libs/dbutils-api-example-0.0.1-SNAPSHOT.jar

You can attach the resulting JAR to your cluster as a library, restart the cluster (which must be running Databricks Runtime 4.0 or above), and then run:

example.Test()

This statement creates a text input widget with the label Hello: and the initial value World.

You can use all the other dbutils APIs the same way.

To test an application that uses the dbutils object outside Databricks, you can mock up the dbutils object by calling:

com.databricks.dbutils_v1.DBUtilsHolder.dbutils0.set(
  new com.databricks.dbutils_v1.DBUtilsV1{
    ...
  }
)

Substitute your own DBUtilsV1 instance and implement the interface methods however you like. For example, you can provide a local file system mockup for dbutils.fs.
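
The same idea carries over to Python code that receives dbutils as a parameter. The sketch below is a Python analogue of a local file system mockup, not the DBUtilsHolder mechanism shown above. The LocalFs class and make_local_dbutils function are hypothetical, cover only mkdirs, put, and head, and assume scheme-less absolute paths such as /tmp/demo.

```python
import os
from types import SimpleNamespace


class LocalFs:
    """Hypothetical stand-in mapping a few dbutils.fs calls onto a local directory."""

    def __init__(self, root):
        self.root = root

    def _local(self, path):
        # Re-root an absolute DBFS-style path under the local directory.
        return os.path.join(self.root, path.lstrip("/"))

    def mkdirs(self, dir):
        os.makedirs(self._local(dir), exist_ok=True)
        return True

    def put(self, file, contents, overwrite=False):
        target = self._local(file)
        if os.path.exists(target) and not overwrite:
            raise FileExistsError(file)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "w", encoding="utf-8") as handle:
            handle.write(contents)
        return True

    def head(self, file, maxBytes=65536):
        with open(self._local(file), "rb") as handle:
            # Drop a trailing partial character if the cut lands mid-sequence.
            return handle.read(maxBytes).decode("utf-8", errors="ignore")


def make_local_dbutils(root):
    """Return an object exposing only an `fs` attribute backed by LocalFs."""
    return SimpleNamespace(fs=LocalFs(root))
```

Code under test can then be given make_local_dbutils(some_temp_dir) in place of the real dbutils object.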