What files can I reference in an init script?

Whether an init script can reference other files depends on where the referenced files are stored. This article outlines this behavior and provides recommendations.

Databricks recommends managing all init scripts as cluster-scoped init scripts. Init scripts should be stored in Unity Catalog volumes if using compute with shared or assigned access mode. Workspace files should be used for init scripts if using compute with no-isolation shared access mode.
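As an illustration of the recommendation above, the following minimal sketch picks a storage type from the access mode and builds one entry for a cluster's init script configuration. The function names, the access mode labels, and the example paths are hypothetical. The `volumes` and `workspace` keys with a `destination` field are assumed to match the `init_scripts` field of the Clusters API, so verify them against the API reference before relying on them.

```python
def recommended_storage(access_mode: str) -> str:
    """Map an access mode (labels follow this article's wording) to a storage type."""
    if access_mode in ("shared", "assigned"):
        # Recommended: Unity Catalog volumes.
        return "volumes"
    if access_mode == "no-isolation shared":
        # Recommended: workspace files.
        return "workspace"
    raise ValueError(f"Unknown access mode: {access_mode!r}")


def init_script_entry(access_mode: str, destination: str) -> dict:
    """Build one init script entry (assumed shape: {storage_type: {"destination": path}})."""
    return {recommended_storage(access_mode): {"destination": destination}}


# Hypothetical paths, for illustration only.
shared_entry = init_script_entry("shared", "/Volumes/main/default/scripts/init.sh")
no_isolation_entry = init_script_entry(
    "no-isolation shared", "/Users/someone@example.com/init.sh"
)
```

The sketch only encodes the recommendation. It does not check that the path exists or that the chosen location is supported on a given Databricks Runtime version.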

What identity is used to run init scripts?

In assigned access mode, the identity of the assigned principal (a user or service principal) is used.

In shared access mode or no-isolation shared access mode, init scripts use the identity of the cluster owner.
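The identity rules above can be restated as a small lookup. This is an illustrative sketch only: the function name, parameter names, and access mode labels are hypothetical and do not correspond to any Databricks API.

```python
from typing import Optional


def init_script_identity(
    access_mode: str,
    cluster_owner: str,
    assigned_principal: Optional[str] = None,
) -> str:
    """Return the identity an init script runs as, per the rules in this article."""
    if access_mode == "assigned":
        # Assigned access mode: the assigned user or service principal is used.
        if assigned_principal is None:
            raise ValueError("Assigned access mode requires an assigned principal")
        return assigned_principal
    if access_mode in ("shared", "no-isolation shared"):
        # Both shared modes: the cluster owner's identity is used.
        return cluster_owner
    raise ValueError(f"Unknown access mode: {access_mode!r}")
```

For example, `init_script_identity("shared", "owner@example.com", "sp-etl")` returns the cluster owner, because the assigned principal is only relevant in assigned access mode.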

Not all locations for storing init scripts are supported on all Databricks Runtime versions and access modes. See Compute compatibility with libraries and init scripts.

Can I reference files in Unity Catalog volumes from init scripts?

You can reference libraries and init scripts stored in Unity Catalog volumes from init scripts stored in Unity Catalog volumes.

Important

Credentials required to access other files stored in Unity Catalog volumes are only made available within init scripts stored in Unity Catalog volumes. You cannot reference any files in Unity Catalog volumes from init scripts configured from other locations.

For clusters with shared access mode, only the configured init script needs to be added to the allowlist. Access to other files referenced in the init script is governed by Unity Catalog.
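The credential rule above can be checked before a script is deployed. The following sketch flags referenced volume files that an init script would be unable to read because the script itself is not stored in a Unity Catalog volume. It assumes that volume paths begin with `/Volumes/`; the function names and example paths are hypothetical.

```python
from typing import List

# Assumption: files in Unity Catalog volumes are addressed under this prefix.
VOLUME_PREFIX = "/Volumes/"


def is_volume_path(path: str) -> bool:
    """Return True if the path points into a Unity Catalog volume."""
    return path.startswith(VOLUME_PREFIX)


def unreachable_volume_references(
    init_script_path: str, referenced_paths: List[str]
) -> List[str]:
    """List referenced volume files the init script cannot access.

    Volume credentials are only available to init scripts that are
    themselves stored in a volume. Unity Catalog permissions still
    apply to each file and are not modeled here.
    """
    if is_volume_path(init_script_path):
        return []
    return [path for path in referenced_paths if is_volume_path(path)]
```

For a script configured from `s3://my-bucket/init.sh` that references `/Volumes/main/default/libs/a.whl`, the function returns that wheel path, signaling that the script should be moved into a volume.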

Can I reference workspace files from init scripts?

You cannot reference libraries or init scripts stored in workspace files from init scripts. This restriction also applies to libraries, init scripts, and other files stored in Databricks Repos.

Can I reference files in cloud object storage from init scripts?

You can reference libraries and init scripts stored in cloud object storage from init scripts.

For clusters with shared access mode, only the configured init script needs to be added to the allowlist. Access to other files referenced in the init script is determined by access configured to cloud object storage.

Databricks recommends using instance profiles to manage access to libraries and init scripts stored in S3. To complete this setup, follow the linked documentation for each step:

  1. Create an IAM role with read and list permissions on your desired buckets. See Configure S3 access with instance profiles.

  2. Launch a cluster with the instance profile. See Launch a compute resource with an instance profile.
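As a rough sketch of what "read and list permissions" in step 1 could look like, the following helper builds an IAM policy document for a single bucket. The function name and bucket name are hypothetical, and the policy is a minimal assumption rather than the exact policy from the linked setup guide, which may require additional actions.

```python
import json


def s3_read_list_policy(bucket: str) -> dict:
    """Build a minimal IAM policy granting list and read access to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Reading applies to the objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket name, for illustration only.
    print(json.dumps(s3_read_list_policy("my-example-bucket"), indent=2))
```

The resulting JSON can be attached to the IAM role behind the instance profile. Keeping the policy read-only limits what an init script can change in the bucket.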

Can I reference files in the DBFS root from init scripts?

Databricks no longer recommends storing or configuring init scripts in the DBFS root. This pattern is considered deprecated and support might be removed in a future release.

Important

Libraries uploaded using the library UI are stored in the DBFS root. All workspace users can modify data and files stored in the DBFS root. You can avoid this by uploading libraries to workspace files or Unity Catalog volumes, using libraries in cloud object storage, or using library package repositories.

You can reference both libraries and init scripts stored in the DBFS root from init scripts.