Set and use environment variables with init scripts

Init scripts have access to the environment variables present on a cluster.

Note

In standard access mode on Databricks Runtime 19 and above, only a predefined set of environment variables is available to init scripts. Other variables that you set on a cluster remain available to your user code, including UDFs, but aren't available to init scripts. See Environment variable limitations.

Default environment variables

Databricks sets many default variables that can be useful in init script logic. Cluster-scoped and global init scripts support the following environment variables:

DB_CLUSTER_ID: the ID of the cluster on which the script is running. See the Clusters API.
DB_CONTAINER_IP: the private IP address of the container in which Spark runs. The init script is run inside this container. See the Clusters API.
DB_IS_DRIVER: whether the script is running on a driver node.
DB_DRIVER_IP: the IP address of the driver node.
DB_INSTANCE_TYPE: the instance type of the host VM.
DB_CLUSTER_NAME: the name of the cluster the script is executing on.
DB_IS_JOB_CLUSTER: whether the cluster was created to run a job. See Configure compute for jobs.

You cannot override these predefined environment variables.
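
As a minimal sketch of how an init script might record these values for debugging, the following loops over the variable names listed above and prints each one, substituting a placeholder when a variable is not set. The function name `log_default_vars` and the `<unset>` placeholder are illustrative assumptions, not part of Databricks.

```bash
#!/bin/bash
# Hypothetical helper: print each default variable listed above, one per line.
log_default_vars() {
  for name in DB_CLUSTER_ID DB_CONTAINER_IP DB_IS_DRIVER DB_DRIVER_IP \
              DB_INSTANCE_TYPE DB_CLUSTER_NAME DB_IS_JOB_CLUSTER; do
    # printenv prints the value of an environment variable and fails if it
    # is not set; "|| true" keeps the loop going in that case.
    value="$(printenv "$name" || true)"
    printf '%s=%s\n' "$name" "${value:-<unset>}"
  done
}

log_default_vars
```

Output from an init script typically ends up in the init script logs, so a listing like this can help confirm which values a given node actually received.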

Set custom environment variables

You can set custom environment variables in the Spark config and then access them from init scripts running on the compute resource. See Environment variables.

You can also set environment variables using the spark_env_vars field in the Create cluster API or Update cluster API.
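
For illustration, a request body that sets two custom variables through `spark_env_vars` might look like the following sketch. The variable names `MY_APP_ENV` and `MY_LOG_LEVEL`, the cluster name, and the `<spark-version>` and `<node-type-id>` values are placeholders; substitute values that are valid in your workspace. The snippet only writes the payload to a local file. Sending it to the Create cluster API is left out because that depends on your workspace URL and authentication.

```bash
#!/bin/bash
# Write a hypothetical Create cluster request body to a temporary file.
payload_file="$(mktemp)"
cat > "$payload_file" <<'EOF'
{
  "cluster_name": "env-var-demo",
  "spark_version": "<spark-version>",
  "node_type_id": "<node-type-id>",
  "num_workers": 1,
  "spark_env_vars": {
    "MY_APP_ENV": "staging",
    "MY_LOG_LEVEL": "INFO"
  }
}
EOF
echo "Wrote request body to ${payload_file}"
```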

Use environment variables

The following example uses a default environment variable to run part of a script only on a driver node:

Bash
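# Log the value so it is visible in the init script output.
echo "DB_IS_DRIVER=${DB_IS_DRIVER}"
if [[ "${DB_IS_DRIVER}" = "TRUE" ]]; then
  # Replace with commands that should run only on the driver.
  echo "Running driver-only setup"
else
  # Replace with commands that should run only on workers.
  echo "Running worker-only setup"
fi
# Replace with commands that should run on both driver and workers.
echo "Running common setup"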
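
Custom variables can be read the same way. The sketch below assumes a hypothetical custom variable named `MY_APP_ENV` and falls back to `dev` when it is missing or empty, so the script still behaves predictably on compute where the variable was not configured. The function name `resolve_app_env` is illustrative.

```bash
#!/bin/bash
# Hypothetical helper: print the value of MY_APP_ENV, or "dev" if it is
# unset or empty.
resolve_app_env() {
  printf '%s\n' "${MY_APP_ENV:-dev}"
}

app_env="$(resolve_app_env)"
echo "Configuring node for environment: ${app_env}"
```

Supplying a default with `${VAR:-default}` avoids a failure or an empty value deep inside the script when a variable is not present.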

Secrets in init scripts

You can use any valid variable name when you reference a secret. Access to secrets referenced in environment variables is determined by the permissions of the user who configured the cluster. Secrets stored in environment variables are accessible by all users of the cluster, but are redacted from plaintext display.

See Use a secret in a Spark configuration property or environment variable.
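
As a hedged sketch, suppose an environment variable with the hypothetical name `MY_DB_PASSWORD` is configured on the cluster with a secret reference of the form `{{secrets/<scope-name>/<secret-name>}}`. An init script might then verify that the variable is populated before continuing, without ever printing its value. The function name `require_secret` is illustrative.

```bash
#!/bin/bash
# Hypothetical helper: succeed only if the named environment variable is
# non-empty. The value itself is never written to the log.
require_secret() {
  var_name="$1"
  # printenv prints the value of an environment variable, or nothing if it
  # is not set.
  if [ -z "$(printenv "$var_name")" ]; then
    echo "Required variable ${var_name} is not set" >&2
    return 1
  fi
  echo "Variable ${var_name} is set"
}

require_secret MY_DB_PASSWORD || echo "Skipping database setup"
```

Checking for presence rather than echoing the value keeps the secret out of the init script logs.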
