Manage users, service principals, and groups

This article introduces the Databricks identity management model and provides an overview of how to manage users, groups, and service principals in Databricks.

For an opinionated perspective on how to best configure identity in Databricks, see Identity best practices.

Databricks identities and roles

There are three types of Databricks identity:

  • Users: User identities recognized by Databricks and represented by email addresses.

  • Service principals: Identities for use with jobs, automated tools, and systems such as scripts, apps, and CI/CD platforms.

  • Groups: A collection of identities used by admins to manage group access to workspaces, data, and other securable objects. All Databricks identities can be assigned as members of groups.

There are five roles defined in Databricks:

  • Account admins can manage your Databricks account-level configurations, including creation of workspaces, Unity Catalog metastores, billing, and cloud resources. Account admins can add users to the account and assign them admin roles. They can also give users access to workspaces, as long as those workspaces use identity federation.

  • Workspace admins can add users to a Databricks workspace, assign them the workspace admin role, and manage access to objects and functionality in the workspace, such as the ability to create clusters or access specified persona-based environments.

  • Metastore admins can manage privileges for all securable objects within a Unity Catalog metastore, such as who can create catalogs or query a table.

  • Account users can use the account console to view and connect to their workspaces. Account and workspace admins can add users to the account.

  • Workspace users perform data science, data engineering, and data analysis tasks in workspaces. Account and workspace admins can give account users access to workspaces, as long as those workspaces use identity federation.

Who can manage identities in Databricks?

To manage identities in Databricks, you must be either an account admin or a workspace admin. The following table details the specific permission needed for user management actions:

| Action | Who can perform this action? |
|---|---|
| Add users and service principals | Account admins can add users and service principals to the account. Workspace admins can add users and service principals to the account from their workspaces. |
| Update users and service principals | Account admins can update users and service principals in the account. |
| Delete users and service principals | Account admins can delete users and service principals from the account. |
| Add groups | Account admins can add groups to the account. Workspace admins can add workspace-local groups to the workspace admin’s workspaces. |
| Update groups | Account admins can update groups in the account. Workspace admins can update workspace-local groups in the workspace admin’s workspaces. |
| Delete groups | Account admins can delete groups from the account. Workspace admins can delete workspace-local groups from the workspace admin’s workspaces. |
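
The rules above can also be read as a lookup from an action and a target to the roles allowed to perform it. The following is a minimal sketch that only encodes the table as data; the names `PERMISSIONS` and `can_perform` are hypothetical and are not part of any Databricks API.

```python
# Illustrative only: encodes the permission table as data.
# PERMISSIONS and can_perform are hypothetical names, not a Databricks API.

# Maps (action, target) -> set of roles allowed to perform the action.
PERMISSIONS = {
    ("add", "user_or_service_principal"): {"account_admin", "workspace_admin"},
    ("update", "user_or_service_principal"): {"account_admin"},
    ("delete", "user_or_service_principal"): {"account_admin"},
    ("add", "account_group"): {"account_admin"},
    ("update", "account_group"): {"account_admin"},
    ("delete", "account_group"): {"account_admin"},
    ("add", "workspace_local_group"): {"workspace_admin"},
    ("update", "workspace_local_group"): {"workspace_admin"},
    ("delete", "workspace_local_group"): {"workspace_admin"},
}


def can_perform(role: str, action: str, target: str) -> bool:
    """Return True if the table lists `role` for this action and target."""
    return role in PERMISSIONS.get((action, target), set())


# Workspace admins can add users (which syncs them to the account),
# but only account admins can delete them.
print(can_perform("workspace_admin", "add", "user_or_service_principal"))
print(can_perform("workspace_admin", "delete", "user_or_service_principal"))
```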

You can have a maximum of 10,000 combined users and service principals and 5,000 groups in an account. Each workspace can have a maximum of 10,000 combined users and service principals and 5,000 groups.
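
Note that the user limit is a combined limit: users and service principals count against the same 10,000 total. A small sketch of that arithmetic, using the figures quoted above (the function name is hypothetical):

```python
# Limits as quoted in this article; the helper name is hypothetical.
MAX_USERS_AND_SERVICE_PRINCIPALS = 10_000
MAX_GROUPS = 5_000


def within_identity_limits(users: int, service_principals: int, groups: int) -> bool:
    """Return True if the counts fit the limits for one account or one workspace."""
    combined = users + service_principals  # users and service principals share one limit
    return combined <= MAX_USERS_AND_SERVICE_PRINCIPALS and groups <= MAX_GROUPS
```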

How do admins assign users to workspaces?

To enable a user, service principal, or group to work in a Databricks workspace, an account admin or workspace admin needs to assign them to a workspace.

Account admins can assign workspace access to users, service principals, and groups that exist in the account as long as a workspace is enabled for identity federation. Workspace admins can also assign users, service principals, and groups to their identity-federated workspaces.

Conversely, if you add a new user or service principal directly to a workspace, that user or service principal will automatically be added to the account and assigned to that workspace. Databricks does not recommend this upstream flow. It’s more straightforward to add your users to the account and then assign them to workspaces. See How does Databricks sync identities between workspaces and the account?.

Groups created directly in workspaces, known as workspace-local groups, are not automatically added to the account. You can manually recreate a workspace-local group as a group in the account. See Special considerations for groups.
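
As an illustration of the recommended flow (add the identity to the account, then assign it to an identity-federated workspace), the sketch below builds the HTTP request for a workspace assignment without sending it. The endpoint path and the `USER` permission value reflect the account-level workspace assignment API as understood at the time of writing and should be verified against the current API reference. The host, IDs, token, and the function name are placeholders.

```python
import json
import urllib.request


def build_workspace_assignment_request(account_host, account_id, workspace_id,
                                       principal_id, token, permissions=("USER",)):
    """Build (but do not send) a request that assigns a principal to a workspace.

    Assumption: the endpoint path below matches the account-level workspace
    assignment API. Confirm it in the API reference before relying on it.
    """
    url = (
        f"{account_host}/api/2.0/accounts/{account_id}"
        f"/workspaces/{workspace_id}/permissionassignments/principals/{principal_id}"
    )
    body = json.dumps({"permissions": list(permissions)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; replace with your account console host, IDs, and token.
req = build_workspace_assignment_request(
    "https://accounts.example.invalid", "ACCOUNT_ID", 1234567890, 42, "TOKEN"
)
print(req.get_method(), req.full_url)
# To actually send the request: urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the example safe to run and makes the URL and body easy to inspect.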

[Figure: Account-level identity diagram]

For those workspaces that aren’t enabled for identity federation, workspace admins manage their workspace users, service principals, and groups entirely within the scope of the workspace (the legacy model). They can’t use the account console or account-level APIs to assign users from the account to these workspaces, but they can use any of the workspace-level interfaces.

Whenever a user or service principal is added to the workspace, that user or service principal is synchronized to the account level. Whenever a user or service principal is deleted at the account level, they lose access to all of their workspaces in the account, regardless of whether identity federation has been enabled. Whenever a group is added to the workspace, that group is a workspace-local group and is not added to the account. See Special considerations for groups.

How do admins enable identity federation on a workspace?

To enable identity federation in a workspace, an admin needs to enable the workspace for Unity Catalog by assigning a Unity Catalog metastore. See Enable a workspace for Unity Catalog.

How does Databricks sync identities between workspaces and the account?

In August 2022, all existing workspace users and service principals were synced automatically to your account as account-level users and service principals. Databricks will continue to sync users or service principals to the account whenever you add them to a workspace. If the workspace user shares a username (email address) with an account-level user or admin that already exists, those users are merged.
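
The sync and merge behavior described above can be pictured with a toy model. This is not Databricks code; the `Account` class and its methods are hypothetical and only mimic the described behavior: adding a user to a workspace also creates the account-level user if it is missing, an existing account-level user with the same username is reused (merged), and deleting at the account level removes access to every workspace.

```python
# Toy model only. The Account class is hypothetical and merely mimics the
# behavior described in this article; it is not how Databricks is implemented.

class Account:
    def __init__(self):
        # username (email address) -> set of workspaces the user can access
        self.users = {}

    def add_account_user(self, email):
        """Add a user at the account level with no workspace access yet."""
        self.users.setdefault(email, set())

    def add_workspace_user(self, workspace, email):
        """Add a user to a workspace; the user is synced (or merged) upstream."""
        self.users.setdefault(email, set()).add(workspace)

    def delete_account_user(self, email):
        """Delete at the account level; access to all workspaces is lost."""
        self.users.pop(email, None)

    def workspaces_for(self, email):
        return self.users.get(email, set())


acct = Account()
acct.add_account_user("ana@example.com")
acct.add_workspace_user("ws-a", "ana@example.com")  # merged with the existing account user
acct.add_workspace_user("ws-b", "bo@example.com")   # new user, synced to the account
print(sorted(acct.users))
```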

Important

If an account admin removes a user or service principal at the account level, that user or service principal is also removed from their workspaces, regardless of whether identity federation has been enabled. You should refrain from deleting account-level users or service principals unless you want them to lose access to all workspaces in the account. Be aware of the following consequences of deleting users:

  • Applications or scripts that use the tokens generated by the user will no longer be able to access the Databricks API

  • Jobs owned by the user will fail

  • Clusters owned by the user will stop

  • Queries or dashboards created by the user and shared using the Run as Owner credential will have to be assigned to a new owner to prevent sharing from failing

Workspace-local groups are not synced to the account level. Workspace-local groups are identified as workspace-local in the workspace admin console and (if identity federation is enabled for the workspace) on the workspace Permissions tab in the account console. For more information, see Special considerations for groups.

Special considerations for groups

While users and service principals created at the workspace level are automatically synchronized to the account, groups created at the workspace level are not. Instead, Databricks has the concept of account groups and workspace-local groups, with special behaviors:

  • Account groups can be created only by account admins. Account groups are available for assignment to identity-federated workspaces, and can be assigned to such workspaces by both account admins and workspace admins.

  • Workspace-local groups can be created only by workspace admins. These groups are identified as workspace-local in the workspace admin console and on the workspace Permissions tab in the account console.

Databricks recommends using account groups instead of workspace-local groups. You must enable your workspace for identity federation in order to use account groups. If you are enabling an existing workspace for identity federation, you can use both account groups and workspace-local groups side by side, but Databricks recommends turning workspace-local groups into account groups to take advantage of centralized workspace assignment and data access management using Unity Catalog.
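
Recreating a workspace-local group as an account group amounts to creating a new group at the account level with the same name and members. The sketch below builds a SCIM 2.0 group payload for that purpose. It assumes the payload would be sent with `POST` to the account-level SCIM Groups endpoint (commonly documented as `/api/2.0/accounts/{account_id}/scim/v2/Groups`; verify this in the API reference) and that the member IDs are account-level IDs. The function name is hypothetical.

```python
import json

# Standard SCIM 2.0 core schema URN for groups (RFC 7643).
SCIM_GROUP_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:Group"


def account_group_payload(display_name, member_ids):
    """Build a SCIM payload describing a group and its members.

    Assumption: member_ids are account-level user or service principal IDs.
    """
    return {
        "schemas": [SCIM_GROUP_SCHEMA],
        "displayName": display_name,
        "members": [{"value": str(member_id)} for member_id in member_ids],
    }


# Example: mirror a workspace-local group named "data-engineers".
payload = account_group_payload("data-engineers", [101, 102])
print(json.dumps(payload, indent=2))
```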

Assigning admin roles

Account admins can assign other users as account admins. They can also become Unity Catalog metastore admins by virtue of creating a metastore, and they can transfer the metastore admin role to another user or group.

Both account admins and workspace admins can assign other users as workspace admins. The workspace admin role is determined by membership in the workspace admins group, which is a default group in Databricks and cannot be deleted.
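
Because the workspace admin role is determined by group membership, granting it comes down to adding a member to the workspace admins group. A minimal sketch of the SCIM 2.0 patch body for that operation follows. It assumes the body would be sent with `PATCH` to the workspace-level SCIM Groups endpoint for the admins group; the exact path and the group's ID are not given in this article and must be looked up. The function name is hypothetical.

```python
# Standard SCIM 2.0 PatchOp message URN (RFC 7644).
SCIM_PATCH_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:PatchOp"


def add_member_patch(user_id):
    """Build a SCIM PatchOp body that adds one member to a group."""
    return {
        "schemas": [SCIM_PATCH_SCHEMA],
        "Operations": [
            {"op": "add", "value": {"members": [{"value": str(user_id)}]}}
        ],
    }


# Example: body for adding the user with (placeholder) ID 12345 to the admins group.
print(add_member_patch(12345))
```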

Assigning entitlements

An entitlement is a property that allows a user, service principal, or group to interact with Databricks in a specified way. Entitlements are assigned to users at the workspace level. The following table lists entitlements and the workspace UI and API property name that you use to manage each one. You can use the workspace admin console and workspace-level SCIM REST APIs to manage entitlements.

| Entitlement name (UI) | Entitlement name (API) | Default | Description |
|---|---|---|---|
| Workspace access | workspace-access | Granted by default. | When granted to a user or service principal, they can access the Data Science & Engineering and Databricks Machine Learning persona-based environments. Can’t be removed from workspace admins. |
| Databricks SQL access | databricks-sql-access | Granted by default. | When granted to a user or service principal, they can access Databricks SQL. |
| Allow unrestricted cluster creation | allow-cluster-create | Not granted to users or service principals by default. | When granted to a user or service principal, they can create clusters. You can restrict access to existing clusters using cluster-level permissions. Can’t be removed from workspace admins. |
| Allow pool creation (not available via UI) | allow-instance-pool-create | Can’t be granted to individual users or service principals. | When granted to a group, its members can create instance pools. Can’t be removed from workspace admins. |
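
As an example of managing an entitlement through the workspace-level SCIM REST APIs, the sketch below builds a SCIM 2.0 patch body that grants one of the entitlements listed above, using its API name. It assumes the body is sent with `PATCH` to the SCIM resource of the user, service principal, or group in the workspace. The endpoint path is not given in this article and should be taken from the API reference. The function name and the validation set are hypothetical helpers.

```python
# Standard SCIM 2.0 PatchOp message URN (RFC 7644).
SCIM_PATCH_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:PatchOp"

# API names of the entitlements listed in this article.
ENTITLEMENTS = {
    "workspace-access",
    "databricks-sql-access",
    "allow-cluster-create",
    "allow-instance-pool-create",
}


def grant_entitlement_patch(entitlement):
    """Build a SCIM PatchOp body that adds one entitlement to a principal."""
    if entitlement not in ENTITLEMENTS:
        raise ValueError(f"unknown entitlement: {entitlement!r}")
    return {
        "schemas": [SCIM_PATCH_SCHEMA],
        "Operations": [
            {"op": "add", "path": "entitlements", "value": [{"value": entitlement}]}
        ],
    }


# Example: body that grants unrestricted cluster creation.
print(grant_entitlement_patch("allow-cluster-create"))
```

Validating the name against the known set catches typos before a request is sent. Remember that, per the table, `allow-instance-pool-create` can be granted only to groups.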

Setting up single sign-on (SSO)

Single sign-on (SSO) enables you to authenticate your users using a third-party identity provider like Okta. If your identity provider supports the SAML 2.0 protocol (or, in the case of account-level SSO, the OIDC protocol), you can use Databricks SSO to integrate with your identity provider.

SSO for account-level and workspace-level identities must be managed separately.

See Set up single sign-on.