Configure SCIM provisioning using Microsoft Azure Active Directory
This article describes how to set up provisioning to Databricks using Azure Active Directory.
You can set up provisioning to Databricks using Azure Active Directory (Azure AD) at the Databricks account level or at the Databricks workspace level.
Databricks recommends that you provision users, service principals, and groups to the account level and manage the assignment of users and groups to workspaces within Databricks. To manage the assignment of users to workspaces, your workspaces must be enabled for identity federation. If you have any workspaces that are not enabled for identity federation, continue to provision users, service principals, and groups directly to those workspaces.
Provision identities to your Databricks account using Azure Active Directory (Azure AD)
You can sync account-level users and groups from your Azure Active Directory (Azure AD) tenant to Databricks using a SCIM provisioning connector.
Important
If you already have SCIM connectors that sync identities directly to your workspaces, you must disable those SCIM connectors when the account-level SCIM connector is enabled. See Migrate workspace-level SCIM provisioning to the account level.
Requirements
Your Databricks account must have the Premium plan or above.
You must have the Cloud Application Administrator role in Azure Active Directory.
Your Azure Active Directory account must be a Premium edition account to provision groups. Provisioning users is available for any Azure Active Directory edition.
You must be a Databricks account admin.
Step 1: Configure Databricks
As a Databricks account admin, log in to the Databricks account console.
Click Settings.
Click User Provisioning.
Click Enable user provisioning.
Copy the SCIM token and the Account SCIM URL. You will use these to configure your Azure AD application.
Note
The SCIM token is restricted to the Account SCIM API /api/2.0/accounts/{account_id}/scim/v2/ and cannot be used to authenticate to other Databricks REST APIs.
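As a rough illustration of how the token and the Account SCIM URL fit together, the following Python sketch builds (but does not send) an authenticated request against the Account SCIM API. The Users resource path follows the SCIM 2.0 convention. The host name, account ID, and token shown are placeholders, and build_scim_request is a hypothetical helper, not part of any Databricks library.

```python
import urllib.request


def build_scim_request(account_scim_url: str, scim_token: str,
                       resource: str = "Users") -> urllib.request.Request:
    """Build a GET request for a SCIM resource under the Account SCIM URL."""
    # Normalize so that exactly one slash separates the base URL and the resource.
    url = account_scim_url.rstrip("/") + "/" + resource.lstrip("/")
    return urllib.request.Request(
        url,
        headers={
            # The SCIM token is sent as a bearer token.
            "Authorization": f"Bearer {scim_token}",
            "Accept": "application/scim+json",
        },
        method="GET",
    )


# Placeholder values: substitute the URL and token copied from the account console.
request = build_scim_request(
    "https://accounts.example.com/api/2.0/accounts/<account-id>/scim/v2/",
    "<scim-token>",
)
print(request.full_url)
```

Passing the resulting object to urllib.request.urlopen would send the request. Because the token is scoped to the Account SCIM API, the same header is expected to be rejected by other Databricks REST endpoints.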
Step 2: Configure the enterprise application
These instructions describe how to create an enterprise application in the Azure portal and use that application for provisioning. If you have an existing enterprise application, you can modify it to automate SCIM provisioning using Microsoft Graph. This removes the need for a separate provisioning application in the Azure portal.
Follow these steps to enable Azure AD to sync users and groups to your Databricks account. This configuration is separate from any configurations you have created to sync users and groups to workspaces.
In your Azure portal, go to Azure Active Directory > Enterprise Applications.
Click + New Application above the application list. Under Add from the gallery, search for and select Azure Databricks SCIM Provisioning Connector.
Enter a Name for the application and click Add.
Under the Manage menu, click Provisioning.
Set Provisioning Mode to Automatic.
Set the SCIM API endpoint URL to the Account SCIM URL that you copied earlier.
Set Secret Token to the Databricks SCIM token that you generated earlier.
Click Test Connection and wait for the message that confirms that the credentials are authorized to enable provisioning.
Click Save.
Step 3: Assign users and groups to the application
Users and groups assigned to the SCIM application will be provisioned to the Databricks account. If you have existing Databricks workspaces, Databricks recommends that you add all existing users and groups in those workspaces to the SCIM application.
Go to Manage > Provisioning.
Under Properties, set Assignment required to No. Databricks recommends this option, which allows all users to sign in to the Databricks account.
To start synchronizing Azure Active Directory users and groups to Databricks, click the Provisioning Status toggle.
Click Save.
Go to Manage > Users and groups.
Add some users and groups. Click Add user, select the users and groups, and click the Assign button.
Wait a few minutes and check that the users and groups exist in your Databricks account.
Users and groups that you add and assign will automatically be provisioned to the Databricks account when Azure Active Directory schedules the next sync.
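One way to check the result of a sync without clicking through the account console is to compare the users you assigned against a listing returned by the SCIM API. The sketch below assumes the listing is a SCIM 2.0 ListResponse document (RFC 7644) with a Resources array whose entries carry a userName field. The sample payload and the helper names are hypothetical.

```python
from __future__ import annotations

import json


def provisioned_user_names(list_response_json: str) -> set[str]:
    """Return the userName values found in a SCIM 2.0 ListResponse document."""
    payload = json.loads(list_response_json)
    return {
        resource["userName"]
        for resource in payload.get("Resources", [])
        if "userName" in resource
    }


def missing_users(expected: set[str], list_response_json: str) -> set[str]:
    """Return the expected users that do not appear in the response."""
    return expected - provisioned_user_names(list_response_json)


# Hypothetical response body, shaped like a SCIM ListResponse.
sample_response = json.dumps({
    "schemas": ["urn:ietf:params:scim:api:messages:2.0:ListResponse"],
    "totalResults": 2,
    "Resources": [
        {"id": "1", "userName": "alice@example.com"},
        {"id": "2", "userName": "bob@example.com"},
    ],
})

still_missing = missing_users(
    {"alice@example.com", "carol@example.com"}, sample_response
)
print(sorted(still_missing))
```

A user who is still missing a few minutes after assignment may simply be waiting for the next scheduled sync.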
Important
If you remove a user from the account-level SCIM application, that user is also deleted from the account and removed from their workspaces, regardless of whether identity federation has been enabled. Databricks recommends that you do not remove account users unless you want them to lose access to all workspaces in the account.
Provision identities to your Databricks workspace using Azure Active Directory (Azure AD)
Preview
This feature is in Public Preview.
If you have any workspaces not enabled for identity federation, you should provision users, service principals, and groups directly to those workspaces. This section describes how to do this.
In the following examples, replace <databricks-instance> with the workspace URL of your Databricks deployment.
Requirements
Your Databricks account must have the Premium plan or above.
You must have the Cloud Application Administrator role in Azure Active Directory.
Your Azure Active Directory account must be a Premium edition account to provision groups. Provisioning users is available for any Azure Active Directory edition.
You must be a Databricks workspace admin.
Step 1: Create the enterprise application and connect it to the Databricks SCIM API
To set up provisioning directly to Databricks workspaces using Azure Active Directory, you create an enterprise application for each Databricks workspace.
These instructions describe how to create an enterprise application in the Azure portal and use that application for provisioning. If you have an existing enterprise application, you can modify it to automate SCIM provisioning using Microsoft Graph. This removes the need for a separate provisioning application in the Azure portal.
As a workspace admin, log in to your Databricks workspace.
Generate a personal access token and copy it. You provide this token to Azure Active Directory in a subsequent step.
Important
Generate this token as a Databricks workspace admin who is not managed by the Azure Active Directory enterprise application. If the Databricks admin user who owns the personal access token is deprovisioned using Azure Active Directory, the SCIM provisioning application will be disabled.
In your Azure portal, go to Azure Active Directory > Enterprise Applications.
Click + New Application above the application list. Under Add from the gallery, search for and select Azure Databricks SCIM Provisioning Connector.
Enter a Name for the application and click Add. Use a name that will help administrators find it, like <workspace-name>-provisioning.
Under the Manage menu, click Provisioning.
Set Provisioning Mode to Automatic.
Enter the SCIM API endpoint URL. Append /api/2.0/preview/scim to your workspace URL:
https://<databricks-instance>/api/2.0/preview/scim
Replace <databricks-instance> with the workspace URL of your Databricks deployment. See Get identifiers for workspace objects.
Set Secret Token to the Databricks personal access token that you generated in step 1.
Click Test Connection and wait for the message that confirms that the credentials are authorized to enable provisioning.
Optionally, enter a notification email to receive notifications of critical errors with SCIM provisioning.
Click Save.
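If you create one enterprise application per workspace, deriving the SCIM endpoint URL by hand for each workspace is easy to get wrong. The following sketch shows one way to build it from a workspace URL by appending /api/2.0/preview/scim as described above. The function name and the example host are hypothetical, and the sketch assumes that the endpoint is always served over HTTPS.

```python
from urllib.parse import urlsplit, urlunsplit

SCIM_PATH = "/api/2.0/preview/scim"


def workspace_scim_endpoint(workspace_url: str) -> str:
    """Return the SCIM endpoint URL for a workspace URL or bare host name."""
    # Accept a bare host such as "my-workspace.example.com" by assuming HTTPS.
    if "://" not in workspace_url:
        workspace_url = "https://" + workspace_url
    parts = urlsplit(workspace_url)
    if not parts.netloc:
        raise ValueError(f"not a valid workspace URL: {workspace_url!r}")
    # Keep only the host; drop any path, query, or fragment that was
    # copied along from the browser address bar.
    return urlunsplit(("https", parts.netloc, SCIM_PATH, "", ""))


print(workspace_scim_endpoint("https://my-workspace.example.com/?o=123#job/1"))
```

The value returned is what you would paste into the SCIM API endpoint URL field for that workspace's enterprise application.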
Step 2: Assign users and groups to the application
Go to Manage > Provisioning.
Under Settings, set Scope to Sync only assigned users and groups.
Databricks recommends this option, which syncs only users and groups assigned to the enterprise application.
Note
Azure Active Directory does not support the automatic provisioning of nested groups to Databricks. Azure Active Directory can only read and provision users that are immediate members of the explicitly assigned group. As a workaround, explicitly assign (or otherwise scope in) the groups that contain the users who need to be provisioned. For more information, see this FAQ.
To start synchronizing Azure Active Directory users and groups to the Databricks workspace, click the Provisioning Status toggle.
Click Save.
Test your provisioning setup:
In your Azure Databricks SCIM Provisioning Connector, go to Manage > Users and groups.
Add some users and groups. Click Add user, select the users and groups, and click the Assign button.
Wait a few minutes and check that the users and groups exist in your Databricks workspace.
In the future, users and groups that you add and assign are automatically provisioned when Azure Active Directory schedules the next sync.
Important
Do not assign the Databricks workspace admin whose personal access token was used to configure the Azure Databricks SCIM Provisioning Connector application.
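To check whether a particular user reached the workspace, you can query the workspace SCIM API instead of browsing the admin settings page. The sketch below only constructs the lookup URL. It assumes that the Users resource is available under /api/2.0/preview/scim/v2/Users and that the API accepts a standard SCIM filter expression on userName. The helper name and the example host are hypothetical.

```python
import json
from urllib.parse import urlencode


def user_lookup_url(workspace_url: str, user_name: str) -> str:
    """Build a SCIM query URL that filters workspace users by userName."""
    # SCIM filter values are JSON-style quoted strings, so json.dumps
    # supplies the surrounding quotes and escapes embedded ones.
    scim_filter = f"userName eq {json.dumps(user_name)}"
    query = urlencode({"filter": scim_filter})
    return f"{workspace_url.rstrip('/')}/api/2.0/preview/scim/v2/Users?{query}"


lookup = user_lookup_url("https://my-workspace.example.com", "alice@example.com")
print(lookup)
```

Send the request with a personal access token in an Authorization: Bearer header. An empty result list suggests that the user has not been provisioned yet.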
(Optional) Automate SCIM provisioning using Microsoft Graph
Microsoft Graph includes authentication and authorization libraries that you can integrate into your application to automate provisioning of users and groups to your Databricks account or workspaces, instead of configuring a SCIM provisioning connector application.
Follow the instructions for registering an application with Microsoft Graph. Make a note of the Application ID and the Tenant ID for the application.
Go to the application’s Overview page. On that page:
Configure a client secret for the application, and make a note of the secret.
Grant the application these permissions:
Application.ReadWrite.All
Application.ReadWrite.OwnedBy
Ask an Azure Active Directory administrator to grant admin consent.
Update your application’s code to add support for Microsoft Graph.
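As a starting point for the last step, the sketch below shows how the Application ID, Tenant ID, and client secret noted above might be combined into a token request for Microsoft Graph. It assumes the Microsoft identity platform client credentials flow, with the v2.0 token endpoint and the https://graph.microsoft.com/.default scope. The request is built but not sent, the helper name is hypothetical, and the identifiers are dummy values.

```python
import urllib.request
from urllib.parse import urlencode


def build_token_request(tenant_id: str, application_id: str,
                        client_secret: str) -> urllib.request.Request:
    """Build a client-credentials token request for Microsoft Graph."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": application_id,
        "client_secret": client_secret,
        # ".default" requests the application permissions already consented to.
        "scope": "https://graph.microsoft.com/.default",
    }).encode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Dummy identifiers for illustration only.
token_request = build_token_request(
    tenant_id="11111111-1111-1111-1111-111111111111",
    application_id="22222222-2222-2222-2222-222222222222",
    client_secret="example-secret",
)
print(token_request.full_url)
```

In a real application, load the client secret from a secret store rather than from source code. A maintained library such as MSAL is usually a better choice than hand-built requests.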
Provisioning tips
Users and groups that existed in the Databricks workspace prior to enabling provisioning exhibit the following behavior upon provisioning sync:
Are merged if they also exist in Azure Active Directory
Are ignored if they don’t exist in Azure Active Directory
User permissions that are assigned individually and are duplicated through membership in a group remain after the group membership is removed for the user.
Users removed from a Databricks workspace directly, using the Databricks workspace admin settings page:
Lose access to that Databricks workspace but may still have access to other Databricks workspaces.
Will not be synced again using Azure Active Directory provisioning, even if they remain in the enterprise application.
The initial Azure Active Directory sync is triggered immediately after you enable provisioning. Subsequent syncs are triggered every 20-40 minutes, depending on the number of users and groups in the application. See Provisioning summary report in the Azure Active Directory documentation.
You cannot update the username or email address of a Databricks workspace user.
The admins group is a reserved group in Databricks and cannot be removed.
You can use the Databricks Groups API or the Groups UI to get a list of members of any Databricks workspace group.
You cannot sync nested groups or Azure Active Directory service principals from the Azure Databricks SCIM Provisioning Connector application. Databricks recommends that you use the enterprise application to sync users and groups and manage nested groups and service principals within Databricks. However, you can also use the Databricks Terraform provider or custom scripts that target the Databricks SCIM API in order to sync nested groups or Azure Active Directory service principals.
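A custom script that syncs nested groups typically has to expand the nesting itself before it calls the SCIM API. The sketch below illustrates only that expansion step on a hypothetical in-memory mapping from group name to members. A real script would read the membership from Azure Active Directory and write the result to Databricks, and neither of those calls is shown here.

```python
from __future__ import annotations


def flatten_group(group: str, members: dict[str, list[str]],
                  _seen: set[str] | None = None) -> set[str]:
    """Return every user reachable from `group`, following nested groups."""
    seen = set() if _seen is None else _seen
    if group in seen:  # guard against cyclic nesting
        return set()
    seen.add(group)
    users: set[str] = set()
    for member in members.get(group, []):
        if member in members:  # the member is itself a group
            users |= flatten_group(member, members, seen)
        else:
            users.add(member)
    return users


# Hypothetical directory: "platform" is nested inside "engineering",
# and also refers back to it to exercise the cycle guard.
directory = {
    "engineering": ["alice@example.com", "platform"],
    "platform": ["bob@example.com", "engineering"],
}

# What the connector would see: immediate user members only.
direct_only = {m for m in directory["engineering"] if m not in directory}
# What a flattening script would provision.
flattened = flatten_group("engineering", directory)
print(sorted(direct_only), sorted(flattened))
```

The difference between the two sets is the group of users that the connector application would miss unless their own group is explicitly assigned.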
Troubleshooting
Users and groups do not sync
If you are using the Azure Databricks SCIM Provisioning Connector application:
For workspace-level provisioning: In the Databricks admin settings page, verify that the Databricks user whose personal access token is being used by the Azure Databricks SCIM Provisioning Connector application is still a workspace admin user in Databricks and that the token is still valid.
For account-level provisioning: In the account console, verify that the Databricks SCIM token that was used to set up provisioning is still valid.
Do not attempt to sync nested groups, which are not supported by Azure Active Directory automatic provisioning. For more information, see this FAQ.
Azure Active Directory service principals do not sync
The Azure Databricks SCIM Provisioning Connector application does not support syncing service principals.
After initial sync, the users and groups stop syncing
If you are using the Azure Databricks SCIM Provisioning Connector application: After the initial sync, Azure Active Directory does not sync immediately after you change user or group assignments. It schedules a sync with the application after a delay, based on the number of users and groups. To request an immediate sync, go to Manage > Provisioning for the enterprise application and select Clear current state and restart synchronization.
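If you manage provisioning through Microsoft Graph rather than the portal, a restart can also be requested programmatically. The sketch below builds (but does not send) a call to the Graph synchronization job restart action. It assumes that you already have an access token, the service principal's object ID, and the synchronization job ID, all shown here as placeholders. Treat the Full reset scope as an approximation of the portal option rather than a guaranteed equivalent.

```python
import json
import urllib.request

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def build_restart_request(service_principal_id: str, job_id: str,
                          access_token: str) -> urllib.request.Request:
    """Build a request that restarts a Graph synchronization job."""
    url = (f"{GRAPH_BASE}/servicePrincipals/{service_principal_id}"
           f"/synchronization/jobs/{job_id}/restart")
    # "Full" asks the service to discard its stored sync state before restarting.
    body = json.dumps({"criteria": {"resetScope": "Full"}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder identifiers for illustration only.
restart_request = build_restart_request("sp-object-id", "job-id", "access-token")
print(restart_request.full_url)
```

A full restart re-evaluates every assigned user and group, so it can take noticeably longer than an incremental sync.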
Azure Active Directory provisioning service IP range not accessible
The Azure Active Directory provisioning service operates under specific IP ranges. If you need to restrict network access, you must allow traffic from the IP addresses for AzureActiveDirectory in this IP range file. For more information, see IP Ranges.
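When building an allowlist, it can help to extract the relevant prefixes from the downloaded IP range file with a script. The sketch below assumes that the file is JSON with a top-level values array whose entries have a name and a properties.addressPrefixes list. The sample document uses reserved documentation ranges, not real Azure addresses, and the helper names are hypothetical.

```python
import ipaddress
import json


def service_tag_prefixes(service_tags_json: str,
                         tag_name: str = "AzureActiveDirectory") -> list:
    """Return the networks listed for one service tag in the range file."""
    document = json.loads(service_tags_json)
    for entry in document.get("values", []):
        if entry.get("name") == tag_name:
            return [ipaddress.ip_network(prefix)
                    for prefix in entry["properties"]["addressPrefixes"]]
    return []


def is_allowed(address: str, prefixes: list) -> bool:
    """Report whether an address falls inside any of the given networks."""
    ip = ipaddress.ip_address(address)
    # Membership is False when the IP versions differ, so mixed lists are safe.
    return any(ip in network for network in prefixes)


# Hypothetical excerpt using documentation-only address ranges.
sample_file = json.dumps({
    "values": [
        {"name": "AzureActiveDirectory",
         "properties": {"addressPrefixes": ["192.0.2.0/24", "2001:db8::/32"]}},
        {"name": "SomeOtherService",
         "properties": {"addressPrefixes": ["198.51.100.0/24"]}},
    ]
})

prefixes = service_tag_prefixes(sample_file)
print(is_allowed("192.0.2.10", prefixes), is_allowed("198.51.100.1", prefixes))
```

The published ranges change over time, so rerun the extraction whenever a new version of the file is released.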