Databricks Utilities

Databricks Utilities (dbutils) make it easy to perform powerful combinations of tasks. You can use the utilities to work with object storage efficiently, to chain and parameterize notebooks, and to work with secrets. dbutils are not supported outside of notebooks.

dbutils utilities are available in Python, R, and Scala notebooks.

How to: List utilities, list commands, display command help

Utilities: credentials, fs, library, notebook, secrets, widgets, Utilities API library

List available utilities

To list available utilities along with a short description for each utility, run dbutils.help() for Python or Scala.

This example lists the available utilities.

dbutils.help()
This module provides various utilities for users to interact with the rest of Databricks.

credentials: DatabricksCredentialUtils -> Utilities for interacting with credentials within notebooks
fs: DbfsUtils -> Manipulates the Databricks filesystem (DBFS) from the console
library: LibraryUtils -> Utilities for session isolated libraries
notebook: NotebookUtils -> Utilities for the control flow of a notebook (EXPERIMENTAL)
secrets: SecretUtils -> Provides utilities for leveraging secrets within notebooks
widgets: WidgetsUtils -> Methods to create and get bound value of input widgets inside notebooks

List available commands for a utility

To list available commands for a utility along with a short description of each command, run .help() after the programmatic name for the utility.

This example lists available commands for the Databricks File System (DBFS) utility.

dbutils.fs.help()
dbutils.fs provides utilities for working with FileSystems. Most methods in this package can take either a DBFS path (e.g., "/foo" or "dbfs:/foo"), or another FileSystem URI. For more info about a method, use dbutils.fs.help("methodName"). In notebooks, you can also use the %fs shorthand to access DBFS. The %fs shorthand maps straightforwardly onto dbutils calls. For example, "%fs head --maxBytes=10000 /file/path" translates into "dbutils.fs.head("/file/path", maxBytes = 10000)".

fsutils

cp(from: String, to: String, recurse: boolean = false): boolean -> Copies a file or directory, possibly across FileSystems
head(file: String, maxBytes: int = 65536): String -> Returns up to the first 'maxBytes' bytes of the given file as a String encoded in UTF-8
ls(dir: String): Seq -> Lists the contents of a directory
mkdirs(dir: String): boolean -> Creates the given directory if it does not exist, also creating any necessary parent directories
mv(from: String, to: String, recurse: boolean = false): boolean -> Moves a file or directory, possibly across FileSystems
put(file: String, contents: String, overwrite: boolean = false): boolean -> Writes the given String out to a file, encoded in UTF-8
rm(dir: String, recurse: boolean = false): boolean -> Removes a file or directory

mount

mount(source: String, mountPoint: String, encryptionType: String = "", owner: String = null, extraConfigs: Map = Map.empty[String, String]): boolean -> Mounts the given source directory into DBFS at the given mount point
mounts: Seq -> Displays information about what is mounted within DBFS
refreshMounts: boolean -> Forces all machines in this cluster to refresh their mount cache, ensuring they receive the most recent information
unmount(mountPoint: String): boolean -> Deletes a DBFS mount point

Display help for a command

To display help for a command, run .help("<command-name>") after the programmatic name for the utility.

This example displays help for the DBFS copy command.

dbutils.fs.help("cp")
/**
* Copies a file or directory, possibly across FileSystems.
*
* Example: cp("/mnt/my-folder/a", "dbfs://a/b")
*
* @param from FileSystem URI of the source file or directory
* @param to FileSystem URI of the destination file or directory
* @param recurse if true, all files and directories will be recursively copied
* @return true if all files were successfully copied
*/
cp(from: java.lang.String, to: java.lang.String, recurse: boolean = false): boolean

Credentials utility (dbutils.credentials)

Commands: assumeRole, showCurrentRole, showRoles

The credentials utility allows you to interact with credentials within notebooks. This utility is usable only on clusters with credential passthrough enabled. To list the available commands, run dbutils.credentials.help().

assumeRole(role: String): boolean -> Sets the role ARN to assume when looking for credentials to authenticate with S3
showCurrentRole: List -> Shows the currently set role
showRoles: List -> Shows the set of possible assumed roles

assumeRole command (dbutils.credentials.assumeRole)

Sets the Amazon Resource Name (ARN) for the AWS Identity and Access Management (IAM) role to assume when looking for credentials to authenticate with Amazon S3. After you run this command, you can run S3 access commands, such as sc.textFile("s3a://my-bucket/my-file.csv") to access an object.

To display help for this command, run dbutils.credentials.help("assumeRole").

dbutils.credentials.assumeRole("arn:aws:iam::123456789012:role/my-role")

# Out[4]: True
dbutils.credentials.assumeRole("arn:aws:iam::123456789012:role/my-role")

// res3: Boolean = true
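
As a rough sketch of how the credentials commands can fit together, the helper below assumes a role only when showRoles reports it as available. The function name assume_role_if_available is hypothetical, and dbutils is passed in as an argument so that the snippet can be read and exercised outside a notebook. It assumes a cluster with credential passthrough enabled.

```python
def assume_role_if_available(dbutils, role_arn):
    """Assume role_arn only when it appears in the list of assumable roles.

    Hypothetical helper: returns True if the role was assumed, else False.
    """
    available = list(dbutils.credentials.showRoles())
    if role_arn not in available:
        return False
    return bool(dbutils.credentials.assumeRole(role_arn))


# In a notebook, where dbutils is predefined:
# assume_role_if_available(dbutils, "arn:aws:iam::123456789012:role/my-role-a")
```

Checking the list first avoids calling assumeRole with an ARN that the current user cannot assume.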

showCurrentRole command (dbutils.credentials.showCurrentRole)

Lists the currently set AWS Identity and Access Management (IAM) role.

To display help for this command, run dbutils.credentials.help("showCurrentRole").

dbutils.credentials.showCurrentRole()

# Out[1]: ['arn:aws:iam::123456789012:role/my-role-a']
dbutils.credentials.showCurrentRole()

// res0: java.util.List[String] = [arn:aws:iam::123456789012:role/my-role-a]

showRoles command (dbutils.credentials.showRoles)

Lists the set of possible assumed AWS Identity and Access Management (IAM) roles.

To display help for this command, run dbutils.credentials.help("showRoles").

dbutils.credentials.showRoles()

# Out[1]: ['arn:aws:iam::123456789012:role/my-role-a', 'arn:aws:iam::123456789012:role/my-role-b']
dbutils.credentials.showRoles()

// res1: java.util.List[String] = [arn:aws:iam::123456789012:role/my-role-a, arn:aws:iam::123456789012:role/my-role-b]

File system utility (dbutils.fs)

Commands: cp, head, ls, mkdirs, mount, mounts, mv, put, refreshMounts, rm, unmount

The file system utility accesses Databricks File System (DBFS), making it easier to use Databricks as a file system. To list the available commands, run dbutils.fs.help().

dbutils.fs provides utilities for working with FileSystems. Most methods in this package can take either a DBFS path (e.g., "/foo" or "dbfs:/foo"), or another FileSystem URI. For more info about a method, use dbutils.fs.help("methodName"). In notebooks, you can also use the %fs shorthand to access DBFS. The %fs shorthand maps straightforwardly onto dbutils calls. For example, "%fs head --maxBytes=10000 /file/path" translates into "dbutils.fs.head("/file/path", maxBytes = 10000)".

fsutils

cp(from: String, to: String, recurse: boolean = false): boolean -> Copies a file or directory, possibly across FileSystems
head(file: String, maxBytes: int = 65536): String -> Returns up to the first 'maxBytes' bytes of the given file as a String encoded in UTF-8
ls(dir: String): Seq -> Lists the contents of a directory
mkdirs(dir: String): boolean -> Creates the given directory if it does not exist, also creating any necessary parent directories
mv(from: String, to: String, recurse: boolean = false): boolean -> Moves a file or directory, possibly across FileSystems
put(file: String, contents: String, overwrite: boolean = false): boolean -> Writes the given String out to a file, encoded in UTF-8
rm(dir: String, recurse: boolean = false): boolean -> Removes a file or directory

mount

mount(source: String, mountPoint: String, encryptionType: String = "", owner: String = null, extraConfigs: Map = Map.empty[String, String]): boolean -> Mounts the given source directory into DBFS at the given mount point
mounts: Seq -> Displays information about what is mounted within DBFS
refreshMounts: boolean -> Forces all machines in this cluster to refresh their mount cache, ensuring they receive the most recent information
unmount(mountPoint: String): boolean -> Deletes a DBFS mount point

cp command (dbutils.fs.cp)

Copies a file or directory, possibly across filesystems.

To display help for this command, run dbutils.fs.help("cp").

This example copies the file named old_file.txt from /FileStore to /tmp/new, renaming the copied file to new_file.txt.

dbutils.fs.cp("/FileStore/old_file.txt", "/tmp/new/new_file.txt")

# Out[4]: True
dbutils.fs.cp("/FileStore/old_file.txt", "/tmp/new/new_file.txt")

# [1] TRUE
dbutils.fs.cp("/FileStore/old_file.txt", "/tmp/new/new_file.txt")

// res3: Boolean = true

head command (dbutils.fs.head)

Returns up to the specified maximum number of bytes of the given file. The bytes are returned as a UTF-8 encoded string.

To display help for this command, run dbutils.fs.help("head").

This example displays the first 25 bytes of the file my_file.txt located in /tmp.

dbutils.fs.head("/tmp/my_file.txt", 25)

# [Truncated to first 25 bytes]
# Out[12]: 'Apache Spark is awesome!\n'
dbutils.fs.head("/tmp/my_file.txt", 25)

# [1] "Apache Spark is awesome!\n"
dbutils.fs.head("/tmp/my_file.txt", 25)

// [Truncated to first 25 bytes]
// res4: String =
// "Apache Spark is awesome!
// "
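
One possible use of the byte limit is to peek at the start of a large text file without reading all of it. The sketch below returns only the first line. The helper name first_line and the 1024-byte default are illustrative assumptions, and dbutils is passed in explicitly.

```python
def first_line(dbutils, path, max_bytes=1024):
    """Return the first line of a text file, reading at most max_bytes bytes."""
    chunk = dbutils.fs.head(path, max_bytes)
    # splitlines() drops the trailing newline; an empty file yields "".
    return chunk.splitlines()[0] if chunk else ""


# In a notebook, where dbutils is predefined:
# first_line(dbutils, "/tmp/my_file.txt")
```

If the first line is longer than max_bytes, the result is cut off at that limit.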

ls command (dbutils.fs.ls)

Lists the contents of a directory.

To display help for this command, run dbutils.fs.help("ls").

This example displays information about the contents of /tmp.

dbutils.fs.ls("/tmp")

# Out[13]: [FileInfo(path='dbfs:/tmp/my_file.txt', name='my_file.txt', size=40)]
dbutils.fs.ls("/tmp")

# For prettier results from dbutils.fs.ls(<dir>), please use `%fs ls <dir>`

# [[1]]
# [[1]]$path
# [1] "dbfs:/tmp/my_file.txt"

# [[1]]$name
# [1] "my_file.txt"

# [[1]]$size
# [1] 40

# [[1]]$isDir
# [1] FALSE

# [[1]]$isFile
# [1] TRUE
dbutils.fs.ls("/tmp")

// res6: Seq[com.databricks.backend.daemon.dbutils.FileInfo] = WrappedArray(FileInfo(dbfs:/tmp/my_file.txt, my_file.txt, 40))
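
Because ls lists only one directory level, walking a tree takes a small amount of recursion. The following sketch sums file sizes under a directory. The helper name total_size is hypothetical, and the sketch assumes that each entry returned in Python exposes path and size attributes and an isDir() method.

```python
def total_size(dbutils, directory):
    """Sum the sizes, in bytes, of all files under directory, recursively."""
    total = 0
    for entry in dbutils.fs.ls(directory):
        if entry.isDir():
            # Descend into subdirectories using the full path from the listing.
            total += total_size(dbutils, entry.path)
        else:
            total += entry.size
    return total


# In a notebook, where dbutils is predefined:
# total_size(dbutils, "/tmp")
```

For very deep or very large trees, each level costs one ls call, so this approach can be slow.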

mkdirs command (dbutils.fs.mkdirs)

Creates the given directory if it does not exist. Also creates any necessary parent directories.

To display help for this command, run dbutils.fs.help("mkdirs").

This example creates the directory structure /parent/child/grandchild within /tmp.

dbutils.fs.mkdirs("/tmp/parent/child/grandchild")

# Out[15]: True
dbutils.fs.mkdirs("/tmp/parent/child/grandchild")

# [1] TRUE
dbutils.fs.mkdirs("/tmp/parent/child/grandchild")

// res7: Boolean = true

mount command (dbutils.fs.mount)

Mounts the specified source directory into DBFS at the specified mount point.

To display help for this command, run dbutils.fs.help("mount").

aws_bucket_name = "my-bucket"
mount_name = "s3-my-bucket"

dbutils.fs.mount("s3a://%s" % aws_bucket_name, "/mnt/%s" % mount_name)
val AwsBucketName = "my-bucket"
val MountName = "s3-my-bucket"

dbutils.fs.mount(s"s3a://$AwsBucketName", s"/mnt/$MountName")

For additional code examples, see Amazon S3.

mounts command (dbutils.fs.mounts)

Displays information about what is currently mounted within DBFS.

To display help for this command, run dbutils.fs.help("mounts").

dbutils.fs.mounts()

# Out[11]: [MountInfo(mountPoint='/mnt/databricks-results', source='databricks-results', encryptionType='sse-s3')]
dbutils.fs.mounts()

For additional code examples, see Amazon S3.
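
The mount and mounts commands are often combined so that a notebook can be re-run without trying to mount the same location twice. A minimal sketch of that pattern follows. The helper name mount_if_absent is hypothetical, and the sketch assumes that each entry returned by mounts has a mountPoint attribute, as the sample output above suggests.

```python
def mount_if_absent(dbutils, source, mount_point):
    """Mount source at mount_point unless something is already mounted there.

    Returns True if a new mount was created, False if one already existed.
    """
    existing = {m.mountPoint for m in dbutils.fs.mounts()}
    if mount_point in existing:
        return False
    return bool(dbutils.fs.mount(source, mount_point))


# In a notebook, where dbutils is predefined:
# mount_if_absent(dbutils, "s3a://my-bucket", "/mnt/s3-my-bucket")
```

The check compares mount points only; it does not verify that an existing mount refers to the same source.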

mv command (dbutils.fs.mv)

Moves a file or directory, possibly across filesystems. A move is a copy followed by a delete, even for moves within filesystems.

To display help for this command, run dbutils.fs.help("mv").

This example moves the file my_file.txt from /FileStore to /tmp/parent/child/grandchild.

dbutils.fs.mv("/FileStore/my_file.txt", "/tmp/parent/child/grandchild")

# Out[2]: True
dbutils.fs.mv("/FileStore/my_file.txt", "/tmp/parent/child/grandchild")

# [1] TRUE
dbutils.fs.mv("/FileStore/my_file.txt", "/tmp/parent/child/grandchild")

// res1: Boolean = true
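
The statement that a move is a copy followed by a delete can be pictured with the sketch below. It is not the internal implementation of dbutils.fs.mv; it only spells out the equivalent sequence of cp and rm calls. The helper name move_via_copy is hypothetical.

```python
def move_via_copy(dbutils, source, destination, recurse=False):
    """Copy source to destination, then delete source only if the copy succeeded."""
    copied = dbutils.fs.cp(source, destination, recurse)
    if not copied:
        return False
    return bool(dbutils.fs.rm(source, recurse))


# In a notebook, where dbutils is predefined:
# move_via_copy(dbutils, "/FileStore/my_file.txt", "/tmp/parent/child/grandchild")
```

One consequence of this model is that moving large directories costs roughly as much as copying them.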

put command (dbutils.fs.put)

Writes the specified string to a file. The string is UTF-8 encoded.

To display help for this command, run dbutils.fs.help("put").

This example writes the string Hello, Databricks! to a file named hello_db.txt in /tmp. If the file exists, it will be overwritten.

dbutils.fs.put("/tmp/hello_db.txt", "Hello, Databricks!", True)

# Wrote 18 bytes.
# Out[6]: True
dbutils.fs.put("/tmp/hello_db.txt", "Hello, Databricks!", TRUE)

# [1] TRUE
dbutils.fs.put("/tmp/hello_db.txt", "Hello, Databricks!", true)

// Wrote 18 bytes.
// res2: Boolean = true

refreshMounts command (dbutils.fs.refreshMounts)

Forces all machines in the cluster to refresh their mount cache, ensuring they receive the most recent information.

To display help for this command, run dbutils.fs.help("refreshMounts").

dbutils.fs.refreshMounts()

For additional code examples, see Amazon S3.

rm command (dbutils.fs.rm)

Removes a file or directory.

To display help for this command, run dbutils.fs.help("rm").

This example removes the file named hello_db.txt in /tmp.

dbutils.fs.rm("/tmp/hello_db.txt")

# Out[8]: True
dbutils.fs.rm("/tmp/hello_db.txt")

# [1] TRUE
dbutils.fs.rm("/tmp/hello_db.txt")

// res6: Boolean = true

unmount command (dbutils.fs.unmount)

Deletes a DBFS mount point.

To display help for this command, run dbutils.fs.help("unmount").

dbutils.fs.unmount("/mnt/<mount-name>")

For additional code examples, see Amazon S3.

Library utility (dbutils.library)

Note

The library utility is deprecated.

Commands: install, installPyPI, list, restartPython, updateCondaEnv

The library utility allows you to install Python libraries and create an environment scoped to a notebook session. The libraries are available both on the driver and on the executors, so you can reference them in user defined functions. This enables:

  • Library dependencies of a notebook to be organized within the notebook itself.
  • Notebook users with different library dependencies to share a cluster without interference.

Detaching a notebook destroys this environment. However, you can recreate it by re-running the library install API commands in the notebook. See the restartPython API for how you can reset your notebook state without losing your environment.

Important

Library utilities are not available on Databricks Runtime ML or Databricks Runtime for Genomics. Instead, refer to Notebook-scoped Python libraries.

For Databricks Runtime 7.2 and above, Databricks recommends using %pip magic commands to install notebook-scoped libraries. See Notebook-scoped Python libraries.

Library utilities are enabled by default. Therefore, by default the Python environment for each notebook is isolated by using a separate Python executable. This executable is created when the notebook is attached to a cluster, and it inherits the default Python environment on the cluster. Libraries installed through an init script into the Databricks Python environment are still available. You can disable this feature by setting spark.databricks.libraryIsolation.enabled to false.

This API is compatible with the existing cluster-wide library installation through the UI and REST API. Libraries installed through this API have higher priority than cluster-wide libraries.

To list the available commands, run dbutils.library.help().

install(path: String): boolean -> Install the library within the current notebook session
installPyPI(pypiPackage: String, version: String = "", repo: String = "", extras: String = ""): boolean -> Install the PyPI library within the current notebook session
list: List -> List the isolated libraries added for the current notebook session via dbutils
restartPython: void -> Restart python process for the current notebook session
updateCondaEnv(envYmlContent: String): boolean -> Update the current notebook's Conda environment based on the specification (content of environment

install command (dbutils.library.install)

Given a path to a library, installs that library within the current notebook session. Libraries installed by calling this command are available only to the current notebook.

To display help for this command, run dbutils.library.help("install").

This example installs a .egg or .whl library within a notebook.

Important

Databricks recommends that you put all your library install commands in the first cell of your notebook and call restartPython at the end of that cell. Running restartPython resets the Python notebook state: the notebook loses all state, including but not limited to local variables, imported libraries, and other ephemeral state.

The accepted library sources are dbfs and s3.

dbutils.library.install("dbfs:/path/to/your/library.egg")
dbutils.library.restartPython() # Removes Python state, but some libraries might not work without calling this command.
dbutils.library.install("dbfs:/path/to/your/library.whl")
dbutils.library.restartPython() # Removes Python state, but some libraries might not work without calling this command.

Note

You can install custom wheel files directly using %pip. The following example assumes that you have uploaded your library wheel file to DBFS:

%pip install /dbfs/path/to/your/library.whl

Egg files are not supported by pip, and wheel is considered the standard for build and binary packaging for Python. See Wheel vs Egg for more details. However, if you want to use an egg file in a way that’s compatible with %pip, you can use the following workaround:

# This step is only needed if no %pip commands have been run yet.
# It will trigger setting up the isolated notebook environment
%pip install <any-lib>  # This doesn't need to be a real library; for example "%pip install any-lib" would work
import sys
# Assuming the preceding step was completed, the following command
# adds the egg file to the current notebook environment
sys.path.append("/local/path/to/library.egg")

installPyPI command (dbutils.library.installPyPI)

Given a Python Package Index (PyPI) package, installs that package within the current notebook session. Libraries installed by calling this command are isolated among notebooks.

To display help for this command, run dbutils.library.help("installPyPI").

This example installs a PyPI package in a notebook. version, repo, and extras are optional. Use the extras argument to specify the Extras feature (extra requirements).

dbutils.library.installPyPI("pypipackage", version="version", repo="repo", extras="extras")
dbutils.library.restartPython()  # Removes Python state, but some libraries might not work without calling this command.

Important

The version and extras keys cannot be part of the PyPI package string. For example: dbutils.library.installPyPI("azureml-sdk[databricks]==1.19.0") is not valid. Use the version and extras arguments to specify the version and extras information as follows:

dbutils.library.installPyPI("azureml-sdk", version="1.19.0", extras="databricks")
dbutils.library.restartPython()  # Removes Python state, but some libraries might not work without calling this command.

Note

When you replace dbutils.library.installPyPI commands with %pip commands, the Python interpreter is restarted automatically. You can run the install command as follows:

%pip install azureml-sdk[databricks]==1.19.0
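
The relationship between the separate installPyPI arguments and the single requirement string that %pip expects can be sketched as a small pure function. The name pip_requirement is hypothetical and is not part of dbutils; it only illustrates how the package, extras, and version pieces combine.

```python
def pip_requirement(package, version="", extras=""):
    """Build a pip requirement string from installPyPI-style arguments."""
    spec = package
    if extras:
        spec += "[" + extras + "]"
    if version:
        spec += "==" + version
    return spec


print(pip_requirement("azureml-sdk", version="1.19.0", extras="databricks"))
# azureml-sdk[databricks]==1.19.0
```

The resulting string is what follows %pip install in the example above.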

This example specifies library requirements in one notebook and installs them by using %run in the other. To do this, first define the libraries to install in a notebook. This example uses a notebook named InstallDependencies.

dbutils.library.installPyPI("torch")
dbutils.library.installPyPI("scikit-learn", version="0.19.1")
dbutils.library.installPyPI("azureml-sdk", extras="databricks")
dbutils.library.restartPython() # Removes Python state, but some libraries might not work without calling this command.

Then install them in the notebook that needs those dependencies.

%run /path/to/InstallDependencies # Install the dependencies in the first cell.
import torch
from sklearn.linear_model import LinearRegression
import azureml
...

This example resets the Python notebook state while maintaining the environment. This technique is available only in Python notebooks. For example, you can use this technique to load a different version of a library that Databricks preinstalls:

dbutils.library.installPyPI("numpy", version="1.15.4")
dbutils.library.restartPython()
# Make sure you start using the library in another cell.
import numpy

You can also use this technique to install libraries such as tensorflow that need to be loaded on process start up:

dbutils.library.installPyPI("tensorflow")
dbutils.library.restartPython()
# Use the library in another cell.
import tensorflow

list command (dbutils.library.list)

Lists the isolated libraries added for the current notebook session through the library utility. This does not include libraries that are attached to the cluster.

To display help for this command, run dbutils.library.help("list").

This example lists the libraries installed in a notebook.

dbutils.library.list()

Note

The equivalent of this command using %pip is:

%pip freeze

restartPython command (dbutils.library.restartPython)

Restarts the Python process for the current notebook session.

To display help for this command, run dbutils.library.help("restartPython").

This example restarts the Python process for the current notebook session.

dbutils.library.restartPython() # Removes Python state, but some libraries might not work without calling this command.

updateCondaEnv command (dbutils.library.updateCondaEnv)

Updates the current notebook’s Conda environment based on the contents of environment.yml. This method is supported only for Databricks Runtime on Conda.

To display help for this command, run dbutils.library.help("updateCondaEnv").

This example updates the current notebook’s Conda environment based on the contents of the provided specification.

dbutils.library.updateCondaEnv(
"""
channels:
  - anaconda
dependencies:
  - gensim=3.4
  - nltk=3.4
""")

Notebook utility (dbutils.notebook)

Commands: exit, run

The notebook utility allows you to chain together notebooks and act on their results. See Notebook workflows.

To list the available commands, run dbutils.notebook.help().

exit(value: String): void -> This method lets you exit a notebook with a value
run(path: String, timeoutSeconds: int, arguments: Map): String -> This method runs a notebook and returns its exit value.

exit command (dbutils.notebook.exit)

Exits a notebook with a value.

To display help for this command, run dbutils.notebook.help("exit").

This example exits the notebook with the value Exiting from My Other Notebook.

dbutils.notebook.exit("Exiting from My Other Notebook")

# Notebook exited: Exiting from My Other Notebook
dbutils.notebook.exit("Exiting from My Other Notebook")

# Notebook exited: Exiting from My Other Notebook
dbutils.notebook.exit("Exiting from My Other Notebook")

// Notebook exited: Exiting from My Other Notebook
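
Because exit accepts a single string, and run returns that string to the caller, structured results need to be serialized. One common approach, shown here as a sketch, is to use JSON. The helper names encode_exit_value and decode_exit_value are hypothetical.

```python
import json


def encode_exit_value(result):
    """Serialize a JSON-compatible object for use with dbutils.notebook.exit."""
    return json.dumps(result)


def decode_exit_value(raw):
    """Parse the string returned by dbutils.notebook.run back into an object."""
    return json.loads(raw)


# In the called notebook:
#   dbutils.notebook.exit(encode_exit_value({"status": "ok", "rows": 42}))
# In the calling notebook:
#   result = decode_exit_value(dbutils.notebook.run("My Other Notebook", 60))
```

This works only for values that json.dumps can represent, such as dictionaries, lists, strings, and numbers.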

run command (dbutils.notebook.run)

Runs a notebook and returns its exit value. The notebook will run in the current cluster by default.

Note

The maximum length of the string value returned from the run command is 5 MB. See Runs get output.

To display help for this command, run dbutils.notebook.help("run").

This example runs a notebook named My Other Notebook in the same location as the calling notebook. The called notebook ends with the line of code dbutils.notebook.exit("Exiting from My Other Notebook"). If the called notebook does not finish running within 60 seconds, an exception is thrown.

dbutils.notebook.run("My Other Notebook", 60)

# Out[14]: 'Exiting from My Other Notebook'
dbutils.notebook.run("My Other Notebook", 60)

// res2: String = Exiting from My Other Notebook
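
Since run throws an exception when the called notebook fails or exceeds its timeout, callers sometimes wrap it in a retry loop. The sketch below is one way to do that. The helper name run_with_retry and the default of three retries are assumptions, and catching every Exception is a deliberately broad simplification.

```python
def run_with_retry(dbutils, path, timeout_seconds, arguments=None, max_retries=3):
    """Run a notebook, retrying after a failure up to max_retries times."""
    attempts = 0
    while True:
        try:
            return dbutils.notebook.run(path, timeout_seconds, arguments or {})
        except Exception:
            attempts += 1
            if attempts > max_retries:
                # Retries are exhausted: surface the last error to the caller.
                raise


# In a notebook, where dbutils is predefined:
# run_with_retry(dbutils, "My Other Notebook", 60)
```

With max_retries=3, the notebook is attempted at most four times in total.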

Secrets utility (dbutils.secrets)

Commands: get, getBytes, list, listScopes

The secrets utility allows you to store and access sensitive credential information without making it visible in notebooks. See Secret management and Use the secrets in a notebook. To list the available commands, run dbutils.secrets.help().

get(scope: String, key: String): String -> Gets the string representation of a secret value with scope and key
getBytes(scope: String, key: String): byte[] -> Gets the bytes representation of a secret value with scope and key
list(scope: String): Seq -> Lists secret metadata for secrets within a scope
listScopes: Seq -> Lists secret scopes

get command (dbutils.secrets.get)

Gets the string representation of a secret value for the specified secrets scope and key.

Warning

Administrators, secret creators, and users granted permission can read Databricks secrets. While Databricks makes an effort to redact secret values that might be displayed in notebooks, it is not possible to prevent such users from reading secrets. For more information, see Secret redaction.

To display help for this command, run dbutils.secrets.help("get").

This example gets the string representation of the secret value for the scope named my-scope and the key named my-key.

dbutils.secrets.get(scope="my-scope", key="my-key")

# Out[14]: '[REDACTED]'
dbutils.secrets.get(scope="my-scope", key="my-key")

# [1] "[REDACTED]"
dbutils.secrets.get(scope="my-scope", key="my-key")

// res0: String = [REDACTED]

getBytes command (dbutils.secrets.getBytes)

Gets the bytes representation of a secret value for the specified scope and key.

To display help for this command, run dbutils.secrets.help("getBytes").

This example gets the secret value (a1!b2@c3#) for the scope named my-scope and the key named my-key.

my_secret = dbutils.secrets.getBytes(scope="my-scope", key="my-key")
my_secret.decode("utf-8")

# Out[1]: 'a1!b2@c3#'
my_secret = dbutils.secrets.getBytes(scope="my-scope", key="my-key")
print(rawToChar(my_secret))

# [1] "a1!b2@c3#"
val mySecret = dbutils.secrets.getBytes(scope="my-scope", key="my-key")
val convertedString = new String(mySecret)
println(convertedString)

// a1!b2@c3#
// mySecret: Array[Byte] = Array(97, 49, 33, 98, 50, 64, 99, 51, 35)
// convertedString: String = a1!b2@c3#

list command (dbutils.secrets.list)

Lists the metadata for secrets within the specified scope.

To display help for this command, run dbutils.secrets.help("list").

This example lists the metadata for secrets within the scope named my-scope.

dbutils.secrets.list("my-scope")

# Out[10]: [SecretMetadata(key='my-key')]
dbutils.secrets.list("my-scope")

# [[1]]
# [[1]]$key
# [1] "my-key"
dbutils.secrets.list("my-scope")

// res2: Seq[com.databricks.dbutils_v1.SecretMetadata] = ArrayBuffer(SecretMetadata(my-key))
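
The metadata returned by list can be used to check whether a key exists before calling get. A minimal sketch follows. The helper name has_secret is hypothetical, and the sketch assumes that each metadata entry has a key attribute, as the sample output above suggests.

```python
def has_secret(dbutils, scope, key):
    """Return True if a secret named key is listed in the given scope."""
    return any(meta.key == key for meta in dbutils.secrets.list(scope))


# In a notebook, where dbutils is predefined:
# if has_secret(dbutils, "my-scope", "my-key"):
#     value = dbutils.secrets.get(scope="my-scope", key="my-key")
```

The check reads only metadata, so it never handles the secret value itself.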

listScopes command (dbutils.secrets.listScopes)

Lists the available scopes.

To display help for this command, run dbutils.secrets.help("listScopes").

This example lists the available scopes.

dbutils.secrets.listScopes()

# Out[14]: [SecretScope(name='my-scope')]
dbutils.secrets.listScopes()

# [[1]]
# [[1]]$name
# [1] "my-scope"
dbutils.secrets.listScopes()

// res3: Seq[com.databricks.dbutils_v1.SecretScope] = ArrayBuffer(SecretScope(my-scope))

Widgets utility (dbutils.widgets)

Commands: combobox, dropdown, get, getArgument, multiselect, remove, removeAll, text

Widgets allow you to parameterize notebooks. See Widgets.

To list the available commands, run dbutils.widgets.help().

combobox(name: String, defaultValue: String, choices: Seq, label: String): void -> Creates a combobox input widget with a given name, default value and choices
dropdown(name: String, defaultValue: String, choices: Seq, label: String): void -> Creates a dropdown input widget a with given name, default value and choices
get(name: String): String -> Retrieves current value of an input widget
getArgument(name: String, optional: String): String -> (DEPRECATED) Equivalent to get
multiselect(name: String, defaultValue: String, choices: Seq, label: String): void -> Creates a multiselect input widget with a given name, default value and choices
remove(name: String): void -> Removes an input widget from the notebook
removeAll: void -> Removes all widgets in the notebook
text(name: String, defaultValue: String, label: String): void -> Creates a text input widget with a given name and default value

combobox command (dbutils.widgets.combobox)

Creates and displays a combobox widget with the specified programmatic name, default value, choices, and optional label.

To display help for this command, run dbutils.widgets.help("combobox").

This example creates and displays a combobox widget with the programmatic name fruits_combobox. It offers the choices apple, banana, coconut, and dragon fruit and is set to the initial value of banana. This combobox widget has an accompanying label Fruits. This example ends by printing the initial value of the combobox widget, banana.

dbutils.widgets.combobox(
  name='fruits_combobox',
  defaultValue='banana',
  choices=['apple', 'banana', 'coconut', 'dragon fruit'],
  label='Fruits'
)

print(dbutils.widgets.get("fruits_combobox"))

# banana
dbutils.widgets.combobox(
  name='fruits_combobox',
  defaultValue='banana',
  choices=list('apple', 'banana', 'coconut', 'dragon fruit'),
  label='Fruits'
)

print(dbutils.widgets.get("fruits_combobox"))

# [1] "banana"
dbutils.widgets.combobox(
  "fruits_combobox",
  "banana",
  Array("apple", "banana", "coconut", "dragon fruit"),
  "Fruits"
)

print(dbutils.widgets.get("fruits_combobox"))

// banana

get command (dbutils.widgets.get)

Gets the current value of the widget with the specified programmatic name. This programmatic name can be either:

  • The name of a custom widget in the notebook, for example fruits_combobox or toys_dropdown.
  • The name of a custom parameter passed to the notebook as part of a notebook task, for example name or age. For more information, see the coverage of parameters for notebook tasks in the Create a job UI or the notebook_params field in the Run now API.

To display help for this command, run dbutils.widgets.help("get").

This example gets the value of the widget that has the programmatic name fruits_combobox.

dbutils.widgets.get('fruits_combobox')

# banana
dbutils.widgets.get('fruits_combobox')

# [1] "banana"
dbutils.widgets.get("fruits_combobox")

// res6: String = banana

This example gets the value of the notebook task parameter that has the programmatic name age. This parameter was set to 35 when the related notebook task was run.

dbutils.widgets.get('age')

# 35
dbutils.widgets.get('age')

# [1] "35"
dbutils.widgets.get("age")

// res6: String = 35
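
As the outputs above show, get returns a string even when the value looks numeric. The sketch below converts a widget or job parameter value to an integer. The helper name get_int_widget and its handling of empty values are assumptions made for illustration.

```python
def get_int_widget(dbutils, name, default=None):
    """Read a widget or job parameter and convert its string value to int."""
    raw = dbutils.widgets.get(name)
    if raw.strip() == "":
        # An empty value falls back to the default, if one was supplied.
        if default is None:
            raise ValueError(f"widget {name!r} is empty")
        return default
    return int(raw)


# In a notebook, where dbutils is predefined:
# age = get_int_widget(dbutils, "age")
```

A non-numeric value such as "banana" raises ValueError from int, which is usually the desired behavior for a required numeric parameter.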

getArgument command (dbutils.widgets.getArgument)

Gets the current value of the widget with the specified programmatic name. If the widget does not exist, an optional message can be returned.

Note

This command is deprecated. Use dbutils.widgets.get instead.

To display help for this command, run dbutils.widgets.help("getArgument").

This example gets the value of the widget that has the programmatic name fruits_combobox. If this widget does not exist, the message Error: Cannot find fruits combobox is returned.

Python

dbutils.widgets.getArgument('fruits_combobox', 'Error: Cannot find fruits combobox')

# Deprecation warning: Use dbutils.widgets.text() or dbutils.widgets.dropdown() to create a widget and dbutils.widgets.get() to get its bound value.
# Out[3]: 'banana'

R

dbutils.widgets.getArgument('fruits_combobox', 'Error: Cannot find fruits combobox')

# Deprecation warning: Use dbutils.widgets.text() or dbutils.widgets.dropdown() to create a widget and dbutils.widgets.get() to get its bound value.
# [1] "banana"

Scala

dbutils.widgets.getArgument("fruits_combobox", "Error: Cannot find fruits combobox")

// command-1234567890123456:1: warning: method getArgument in trait WidgetsUtils is deprecated: Use dbutils.widgets.text() or dbutils.widgets.dropdown() to create a widget and dbutils.widgets.get() to get its bound value.
// dbutils.widgets.getArgument("fruits_combobox", "Error: Cannot find fruits combobox")
//                 ^
// res7: String = banana
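
Because getArgument is deprecated, code that relies on its fallback message may need a small wrapper around dbutils.widgets.get. The sketch below is one possible approach, not an official replacement. The helper get_widget_or_default is hypothetical, it receives the getter as an argument so that it runs outside a notebook, and it assumes that get raises an exception when the widget is missing. The exact exception type is not specified here, so the sketch catches Exception broadly.

```python
from typing import Callable


def get_widget_or_default(get: Callable[[str], str], name: str, default: str) -> str:
    """Return the widget value, or `default` if the lookup fails (hypothetical helper)."""
    try:
        return get(name)
    except Exception:  # assumption: a missing widget surfaces as an exception
        return default


# Stand-in for dbutils.widgets.get so the sketch runs anywhere.
_widgets = {"fruits_combobox": "banana"}


def fake_get(name: str) -> str:
    return _widgets[name]  # raises KeyError for unknown widgets


message = "Error: Cannot find fruits combobox"
print(get_widget_or_default(fake_get, "fruits_combobox", message))
print(get_widget_or_default(fake_get, "missing_widget", message))
```

In a notebook, pass dbutils.widgets.get as the first argument in place of fake_get.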

multiselect command (dbutils.widgets.multiselect)

Creates and displays a multiselect widget with the specified programmatic name, default value, choices, and optional label.

To display help for this command, run dbutils.widgets.help("multiselect").

This example creates and displays a multiselect widget with the programmatic name days_multiselect. It offers the choices Monday through Sunday and is set to the initial value of Tuesday. This multiselect widget has an accompanying label Days of the Week. This example ends by printing the initial value of the multiselect widget, Tuesday.

Python

dbutils.widgets.multiselect(
  name='days_multiselect',
  defaultValue='Tuesday',
  choices=['Monday', 'Tuesday', 'Wednesday', 'Thursday',
    'Friday', 'Saturday', 'Sunday'],
  label='Days of the Week'
)

print(dbutils.widgets.get("days_multiselect"))

# Tuesday

R

dbutils.widgets.multiselect(
  name='days_multiselect',
  defaultValue='Tuesday',
  choices=list('Monday', 'Tuesday', 'Wednesday', 'Thursday',
    'Friday', 'Saturday', 'Sunday'),
  label='Days of the Week'
)

print(dbutils.widgets.get("days_multiselect"))

# [1] "Tuesday"

Scala

dbutils.widgets.multiselect(
  "days_multiselect",
  "Tuesday",
  Array("Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday"),
  "Days of the Week"
)

print(dbutils.widgets.get("days_multiselect"))

// Tuesday
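
The examples above print a single selected value. When several choices are selected, the value still comes back as one string. The sketch below assumes that the selections are joined with commas (for example Monday,Tuesday) and shows one way to turn that string into a list. The helper name split_multiselect is hypothetical, and the raw string is passed in directly so the snippet runs outside a notebook.

```python
from typing import List


def split_multiselect(raw: str) -> List[str]:
    """Split a comma-separated multiselect value into a list, dropping blanks."""
    # Assumption: multiple selections are joined with commas in one string.
    return [item.strip() for item in raw.split(",") if item.strip()]


# In a notebook: split_multiselect(dbutils.widgets.get("days_multiselect"))
print(split_multiselect("Tuesday"))
print(split_multiselect("Monday,Tuesday"))
```

Note that this simple split would not work for choices that themselves contain commas.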

remove command (dbutils.widgets.remove)

Removes the widget with the specified programmatic name.

To display help for this command, run dbutils.widgets.help("remove").

This example removes the widget with the programmatic name fruits_combobox.

Python

dbutils.widgets.remove('fruits_combobox')

R

dbutils.widgets.remove('fruits_combobox')

Scala

dbutils.widgets.remove("fruits_combobox")

removeAll command (dbutils.widgets.removeAll)

Removes all widgets from the notebook.

To display help for this command, run dbutils.widgets.help("removeAll").

This example removes all widgets from the notebook.

Python

dbutils.widgets.removeAll()

R

dbutils.widgets.removeAll()

Scala

dbutils.widgets.removeAll()

text command (dbutils.widgets.text)

Creates and displays a text widget with the specified programmatic name, default value, and optional label.

To display help for this command, run dbutils.widgets.help("text").

This example creates and displays a text widget with the programmatic name your_name_text. It is set to the initial value of Enter your name. This text widget has an accompanying label Your name. This example ends by printing the initial value of the text widget, Enter your name.

Python

dbutils.widgets.text(
  name='your_name_text',
  defaultValue='Enter your name',
  label='Your name'
)

print(dbutils.widgets.get("your_name_text"))

# Enter your name

R

dbutils.widgets.text(
  name='your_name_text',
  defaultValue='Enter your name',
  label='Your name'
)

print(dbutils.widgets.get("your_name_text"))

# [1] "Enter your name"

Scala

dbutils.widgets.text(
  "your_name_text",
  "Enter your name",
  "Your name"
)

print(dbutils.widgets.get("your_name_text"))

// Enter your name
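
As the output above shows, the widget returns its default value until someone types something else, so a prompt such as Enter your name can be mistaken for real input. The following is a minimal sketch of one way to tell the two apart. The helper entered_name and the PLACEHOLDER constant are hypothetical, and the raw string is passed in directly so the snippet runs outside a notebook.

```python
from typing import Optional

PLACEHOLDER = "Enter your name"  # same default used when the widget was created


def entered_name(raw: str, placeholder: str = PLACEHOLDER) -> Optional[str]:
    """Return the typed name, or None if the widget still holds its placeholder."""
    value = raw.strip()
    if not value or value == placeholder:
        return None
    return value


# In a notebook: entered_name(dbutils.widgets.get("your_name_text"))
print(entered_name("Enter your name"))
print(entered_name("Ada"))
```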

Databricks Utilities API library

To accelerate application development, it can be helpful to compile, build, and test applications before you deploy them as production jobs. To enable you to compile against Databricks Utilities, Databricks provides the dbutils-api library. You can download the dbutils-api library from the DBUtils API webpage on the Maven Repository website or include the library by adding a dependency to your build file:

  • SBT

    libraryDependencies += "com.databricks" % "dbutils-api_TARGET" % "VERSION"
    
  • Maven

    <dependency>
        <groupId>com.databricks</groupId>
        <artifactId>dbutils-api_TARGET</artifactId>
        <version>VERSION</version>
    </dependency>
    
  • Gradle

    implementation 'com.databricks:dbutils-api_TARGET:VERSION'
    

Replace TARGET with the desired Scala target version (for example, 2.12) and VERSION with the desired library version (for example, 0.0.5). For a list of available targets and versions, see the DBUtils API webpage on the Maven Repository website.
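
To make the substitution concrete, the sketch below assembles the dependency coordinate from the two placeholders. The helper name dbutils_api_coordinate is hypothetical, and the values 2.12 and 0.0.5 are only the examples mentioned above. Check the Maven Repository listing for the targets and versions that actually exist.

```python
def dbutils_api_coordinate(target: str, version: str) -> str:
    """Build the group:artifact:version coordinate for the dbutils-api library."""
    return f"com.databricks:dbutils-api_{target}:{version}"


# Example values taken from the text; they are illustrations, not recommendations.
print(dbutils_api_coordinate("2.12", "0.0.5"))
```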

Once you build your application against this library, you can deploy the application.

Important

The dbutils-api library allows you to locally compile an application that uses dbutils, but not to run it. To run the application, you must deploy it in Databricks.

Limitations

Calling dbutils inside of executors can produce unexpected results. For information about executors, see Cluster Mode Overview on the Apache Spark website.
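
One way to stay clear of this limitation is to read any dbutils values once on the driver and pass only plain values into functions that may run on executors. The following is a minimal sketch of that pattern, not a prescribed fix. The names make_age_filter and is_old_enough are hypothetical, the literal "35" stands in for a widget value, and the built-in filter stands in for a distributed transformation so the snippet runs without a cluster.

```python
from typing import Callable, List


def make_age_filter(min_age: int) -> Callable[[int], bool]:
    """Build a predicate that closes over a plain int, not over dbutils."""
    def is_old_enough(age: int) -> bool:
        return age >= min_age
    return is_old_enough


# On the driver: resolve the value once, e.g. int(dbutils.widgets.get('age')).
min_age = int("35")  # literal stands in for the widget value
predicate = make_age_filter(min_age)

# The predicate no longer references dbutils; plain filter() stands in for
# a transformation that would run on executors.
ages: List[int] = [20, 35, 50]
print(list(filter(predicate, ages)))
```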