Manage identities, permissions, and privileges for Lakeflow Jobs
This article contains recommendations and instructions for managing identities, permissions, and privileges for Lakeflow Jobs.
Secrets are not redacted from a cluster's Spark driver log `stdout` and `stderr` streams. To protect sensitive data, by default, Spark driver logs are viewable only by users with CAN MANAGE permission on job, dedicated access mode, and standard access mode clusters. To allow users with CAN ATTACH TO or CAN RESTART permission to view the logs on these clusters, set the following Spark configuration property in the cluster configuration: `spark.databricks.acl.needAdminPermissionToViewLogs false`.
On No Isolation Shared access mode clusters, the Spark driver logs can be viewed by users with CAN ATTACH TO, CAN RESTART, or CAN MANAGE permission. To limit who can read the logs to only users with the CAN MANAGE permission, set `spark.databricks.acl.needAdminPermissionToViewLogs` to `true`.
See Spark configuration to learn how to add Spark properties to a cluster configuration.
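For illustration, a minimal sketch of how this property might be set on a job cluster with the Databricks SDK for Python, assuming the `databricks-sdk` package is installed; the runtime version and node type are placeholders to adjust for your workspace:

```python
from databricks.sdk.service import compute

# A job cluster spec carrying the log-visibility property from this article.
cluster_spec = compute.ClusterSpec(
    spark_version="15.4.x-scala2.12",  # placeholder runtime version
    node_type_id="i3.xlarge",  # cloud-specific placeholder node type
    num_workers=2,
    spark_conf={
        # "false" lets CAN ATTACH TO / CAN RESTART holders read driver logs;
        # omit the property to keep the default (CAN MANAGE only).
        "spark.databricks.acl.needAdminPermissionToViewLogs": "false",
    },
)
```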
Job privileges and the Run as user
Jobs are objects in Databricks, and have privileges that let you access or manage those jobs. This page describes these privileges as job privileges (or permissions).
Jobs also run and perform tasks on behalf of a user (or principal) who has their own privileges to act upon the resources that the job references. The user the job acts as is called the Run as user, and those privileges are referred to on this page as Run as privileges (or permissions). The Run as user's privileges are used when the job is run.
For example, if user A creates a job, and sets the Run as user to user B, the job will run with user B's privileges. If user C runs the job, the job will still run with user B's privileges. This means that it is possible to give someone the ability to run a job to get information from datasets that they do not themselves have access to.
Default privileges for jobs
Jobs have the following privileges set by default:
- The creator of the job is granted the IS OWNER permission on the job.
- Workspace admins are granted the CAN MANAGE permission on the job.
- The creator of the job is set for Run as.
- The Run as user's permissions are used when running the job (including the tasks within the job).
Because the default is to set the creator as both the owner and the Run as user, the creator's privileges are used when the job runs. Databricks recommends changing the Run as user to a service principal, so that privileges can be controlled separately from the owner's privileges, and so that jobs do not break when the owner leaves the organization or their privileges change.
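If you manage jobs with the Databricks SDK for Python, you can set Run as when you create the job. A minimal sketch, assuming workspace authentication is configured and a service principal already exists; the job name, notebook path, and application ID are placeholders:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()

job = w.jobs.create(
    name="nightly-etl",  # placeholder job name
    tasks=[
        jobs.Task(
            task_key="main",
            # With no cluster spec, the task uses serverless jobs compute
            # where available.
            notebook_task=jobs.NotebookTask(notebook_path="/Workspace/etl/main"),
        )
    ],
    # Run as a service principal (identified by its application ID) so runs
    # don't depend on the creator's account or privileges.
    run_as=jobs.JobRunAs(service_principal_name="9f0b8c1e-placeholder-app-id"),
)
print(job.job_id)
```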
Admin permissions for jobs
By default, workspace admins can change the job owner or Run as configuration to any user or service principal in the workspace. Account admins can configure the `RestrictWorkspaceAdmins` setting to change this behavior. See Restrict workspace admins.
How do jobs interact with Unity Catalog permissions?
Jobs run as the identity of the user in the Run as setting. When running tasks, this identity is evaluated against permission grants for the following resources that might be used in those tasks:
- Unity Catalog-managed assets, including tables, volumes, models, and views.
- Legacy table access control lists (ACLs) for assets registered in the legacy Hive metastore.
- ACLs for compute, notebooks, queries, and other workspace assets.
- Databricks secrets. See Secret management.
Unity Catalog grants and legacy table ACLs require compatible compute access modes. See Configure compute for jobs.
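For example, if a job's Run as identity is a service principal, that principal needs explicit grants on the Unity Catalog objects its tasks read or write. A sketch using the Databricks SDK for Python; the three-level table name and principal are placeholders:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import catalog

w = WorkspaceClient()

# Grant the job's Run as service principal read access to a table it queries.
w.grants.update(
    securable_type=catalog.SecurableType.TABLE,
    full_name="main.sales.orders",  # placeholder catalog.schema.table
    changes=[
        catalog.PermissionsChange(
            principal="9f0b8c1e-placeholder-app-id",  # SP application ID
            add=[catalog.Privilege.SELECT],
        )
    ],
)
```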
When are privileges evaluated?
Job privileges are evaluated when a user performs an action on that job, such as editing or running the job.
Run as privileges are evaluated during a job run. So each task might check privileges when the task starts or during the run.
Not all Run as privileges are verified at the beginning of a job run. If you change the Run as user's privileges while a job is running, especially if you remove privileges, the job might fail before it completes.
SQL tasks and permissions
The file task is the only SQL task type that fully respects the Run as user.
SQL queries, alerts, and legacy dashboard tasks respect configured sharing settings.
- Run as owner: Runs of the scheduled SQL task always use the identity of the owner of the configured SQL asset.
- Run as viewer: Runs of the scheduled SQL task always use the identity set in the job Run as field.
To learn more about query sharing settings, see Configure query permissions.
Example
The following scenario illustrates the interaction of SQL sharing settings and the job Run as setting:
- User A is the owner of the SQL query named `my_query`.
- User A configures `my_query` with the sharing setting Run as owner.
- User B schedules `my_query` as a task in a job named `my_job`.
- User B configures `my_job` to run with a service principal named `prod_sp`.
- When `my_job` runs, it uses the identity for User A to run `my_query`.
Now assume that User B does not want this behavior. Starting from the existing configuration, the following occurs:
- User A changes the sharing setting for `my_query` to Run as viewer.
- When `my_job` runs, it uses the identity `prod_sp`.
Configure the Run as user for job runs
To change the Run as setting, you must have the CAN MANAGE or IS OWNER permission on the job.
You can set the Run as setting to yourself or any service principal in the workspace on which you have the Service Principal User entitlement.
To configure the Run as setting for a job in the workspace UI, select an existing job using the following steps:
- In your Databricks workspace's sidebar, click Jobs & Pipelines.
- Optionally, select the Jobs and Owned by me filters to make it easier to find the job.
- Click the name of the job in the list.
- In the Job details side panel, click the pencil icon next to the Run as field.
- Search for and select a user or service principal.
- Click Save.
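You can also change the Run as setting outside the UI, for example with the Databricks SDK for Python. A sketch assuming you already know the job ID and the service principal's application ID (both placeholders here):

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()

# Point an existing job's Run as setting at a service principal.
w.jobs.update(
    job_id=123456789,  # placeholder job ID
    new_settings=jobs.JobSettings(
        run_as=jobs.JobRunAs(service_principal_name="9f0b8c1e-placeholder-app-id")
    ),
)
```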
For more information on working with service principals, see the following:
- Service principals
- Roles for managing service principals
- List the service principals that you can use.
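As a quick way to find a service principal's application ID for the Run as field, you can list service principals with the SDK (a sketch; you can only set Run as to principals on which you hold the Service Principal User entitlement):

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Print each service principal's display name and application ID.
for sp in w.service_principals.list():
    print(sp.display_name, sp.application_id)
```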
Best practices for jobs governance
Databricks recommends the following for all production jobs:
- Run production jobs using a service principal

  Jobs run as the job creator by default. If the Run as user leaves your organization, the job might fail. If you assign the Run as user to a service principal, job runs use the permissions of the service principal and are unaffected when users leave or their privileges change. By default, workspace admins can manage job permissions and reassign ownership if necessary.

  Using service principals for production jobs also allows you to restrict write permissions on production data. If you run jobs using a user's permissions, that user needs the same permissions to edit the production data required by the job.
- Always use Unity Catalog-compatible compute configurations

  Unity Catalog data governance requires that you use a supported compute configuration. Serverless compute for jobs and SQL warehouses always use Unity Catalog. For jobs with classic compute, Databricks recommends standard access mode for supported workloads. Use dedicated access mode when required. Lakeflow Declarative Pipelines configured with Unity Catalog have some limitations. See Limitations.
- Restrict job privileges on production jobs

  Job privileges control who can view, run, or manage jobs.

  - Users that view the job configuration or monitor runs need the CAN VIEW permission.
  - Users that trigger, stop, or restart job runs need the CAN MANAGE RUN permission.
  - Only grant CAN MANAGE or IS OWNER privileges to users trusted to modify production code.
Control access to a job
Job access control enables job owners and administrators to grant fine-grained permissions on jobs. The following permissions are available; in the table, each permission includes the grants of the permissions listed below it.
| Permission | Grant |
| --- | --- |
| IS OWNER | The identity used for Run as by default. You can set the Run as user to override this. |
| CAN MANAGE | Can edit the job definition, including configuration, tasks, and permissions. Can pause and resume a schedule. |
| CAN MANAGE RUN | Can trigger and cancel job runs. |
| CAN VIEW | Can view job run results, including details, history, and status. |
- The creator of a job has the IS OWNER permission by default.
- A job cannot have more than one owner.
- A group cannot be assigned the IS OWNER permission.
- Jobs triggered through Run Now assume the permissions of the Run as user (the owner, by default), not the user who issued Run Now.
- Job access control applies to jobs displayed in the Jobs & Pipelines UI and their runs. It doesn't apply to:
  - Notebook workflows that run modular or linked code. These use the permissions of the notebook itself. If the notebook comes from Git, a new copy is created and its files inherit the permissions of the user who triggered the run.
  - Jobs submitted by API. These use the notebook's default permissions unless you explicitly set the `access_control_list` in the API request (see the sketch after this list).
- For information on job permission levels, see Job ACLs.
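As an illustration of that last point, a one-time run submitted through the API can carry its own ACL. A minimal sketch calling the REST endpoint through the SDK's API client; the run name, notebook path, and identity are placeholders:

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Submit a one-time run and explicitly attach an access control list.
resp = w.api_client.do(
    "POST",
    "/api/2.1/jobs/runs/submit",
    body={
        "run_name": "adhoc-validation",  # placeholder run name
        "tasks": [
            {
                "task_key": "main",
                "notebook_task": {"notebook_path": "/Workspace/etl/validate"},
            }
        ],
        # Without this field, the run falls back to the notebook's defaults.
        "access_control_list": [
            {"user_name": "ops@example.com", "permission_level": "CAN_MANAGE_RUN"}
        ],
    },
)
print(resp)
```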
Configure job permissions
To configure permissions for a job in the workspace UI, select an existing job using the following steps:
- In your Databricks workspace's sidebar, click Jobs & Pipelines.
- Optionally, select the Jobs and Owned by me filters to make it easier to find the job.
- Click your job's Name link.
- In the Job details panel, click Edit permissions. The Permission Settings dialog appears.
- Click the Select User, Group or Service Principal… field and start typing a user, group, or service principal. The field searches all available identities in the workspace.
- Click Add.
- Click Save.
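The same permissions can be applied programmatically. A sketch with the Databricks SDK for Python, mirroring the best-practice mapping above; the group, user, and job ID are placeholders:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import iam

w = WorkspaceClient()

# Grant viewers read-only access and operators run control on one job.
# permissions.set replaces the job's current direct ACL; use
# permissions.update instead to add entries without replacing.
w.permissions.set(
    request_object_type="jobs",
    request_object_id="123456789",  # placeholder job ID, as a string
    access_control_list=[
        iam.AccessControlRequest(
            group_name="data-analysts",  # placeholder group
            permission_level=iam.PermissionLevel.CAN_VIEW,
        ),
        iam.AccessControlRequest(
            user_name="ops@example.com",  # placeholder user
            permission_level=iam.PermissionLevel.CAN_MANAGE_RUN,
        ),
    ],
)
```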
Manage the job owner
Only workspace admins can edit the job owner. Exactly one job owner must be assigned. Job owners can be users or service principals.
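Because ownership is the IS OWNER permission level, a workspace admin can also reassign it through the permissions API. A sketch, assuming admin rights; identifiers are placeholders. Since a job has exactly one owner, assigning IS OWNER to a new principal transfers ownership from the previous one:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import iam

w = WorkspaceClient()

# Reassign job ownership to a service principal (workspace admins only).
w.permissions.update(
    request_object_type="jobs",
    request_object_id="123456789",  # placeholder job ID
    access_control_list=[
        iam.AccessControlRequest(
            service_principal_name="9f0b8c1e-placeholder-app-id",
            permission_level=iam.PermissionLevel.IS_OWNER,
        )
    ],
)
```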