Single sign-on (SSO) enables you to authenticate your users using your organization’s identity provider. If your identity provider supports the SAML 2.0 protocol, you can use Databricks SSO to integrate with your identity provider.
You can give users in your organization access to Databricks in one of two ways:
- Add users through your identity provider and enable auto user creation. If the user’s account does not already exist, a new account will be provisioned for them upon login.
- Manually add users in Databricks as described in Manage users and disable auto user creation. If the user’s account does not already exist in Databricks, they cannot log in.
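The two options above differ only in what happens when someone who has no Databricks account signs in through the identity provider. The following is a minimal sketch of that decision, not Databricks code; the function name `sso_login_outcome` and its return values are hypothetical and exist only for illustration.

```python
def sso_login_outcome(email, existing_users, auto_user_creation):
    """Model the result of an SSO sign-in attempt (illustrative only).

    email: address asserted by the identity provider.
    existing_users: set of usernames that already exist in Databricks.
    auto_user_creation: True if "Allow auto user creation" is selected.
    """
    if email in existing_users:
        # The account already exists, so the user simply logs in.
        return "login"
    if auto_user_creation:
        # No account yet: one is provisioned on first login.
        return "provisioned"
    # No account and auto user creation is disabled: login is refused.
    return "denied"


users = {"alice@example.com"}
print(sso_login_outcome("alice@example.com", users, auto_user_creation=False))
print(sso_login_outcome("bob@example.com", users, auto_user_creation=True))
print(sso_login_outcome("bob@example.com", users, auto_user_creation=False))
```

With auto user creation disabled, an admin has to add `bob@example.com` manually before that user can sign in.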
Go to the Admin Console and select the SSO tab.
Go to your identity provider and create a Databricks application with the information in the Databricks SAML URL field.
You can read the instructions on how to set this up for:
- AWS single sign-on (SSO)
- Microsoft Windows Active Directory
- GSuite single sign-on (SSO)
- Okta single sign-on (SSO)
- OneLogin single sign-on (SSO)
- Ping Identity single sign-on (SSO)
The process will be similar for any identity provider that supports SAML 2.0.
On the Databricks SSO tab, in the Provide the information from the identity provider field, paste in the information from your identity provider.
If you want to enable automatic user creation, select Allow auto user creation.
If you are configuring Secure access to S3 buckets using IAM credential passthrough with SAML 2.0 federation, select Allow IAM role entitlement auto sync.
Click Enable SSO.
Once SSO is enabled, the sign-in page shows a new option to sign in with single sign-on.
- Non-admin users can sign in only using SSO.
- Admin users can sign in with either SSO or their username and password. If SSO sign-in fails, admins can sign in with their password and disable SSO in the Admin Console.
- API users can still use their username and password to make REST API calls.
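As an illustration of the last point, the sketch below builds a REST API request that authenticates with a username and password through HTTP basic authentication. The workspace URL, the credentials, and the helper name `build_basic_auth_request` are placeholders, and the clusters list endpoint is used only as an example call. The request is constructed but not sent.

```python
import base64
import urllib.request


def build_basic_auth_request(workspace_url, username, password,
                             path="/api/2.0/clusters/list"):
    """Build (but do not send) a REST API request that uses basic auth."""
    # HTTP basic auth: base64-encode "username:password".
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(workspace_url.rstrip("/") + path)
    request.add_header("Authorization", f"Basic {token}")
    return request


# Placeholder values; substitute your own workspace URL and credentials.
request = build_basic_auth_request(
    "https://example.cloud.databricks.com",
    "user@example.com",
    "placeholder-password",
)
print(request.full_url)
# To send the request: urllib.request.urlopen(request)
```

Avoid hard-coding real passwords in scripts; reading them from an environment variable or a secrets store is safer.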
If a user’s current email address (username) with Databricks is the same as in the identity provider, then the migration is automatic (as long as auto user creation is enabled) and you can skip this step.
If a user’s email address with the identity provider is different from the one with Databricks, then a new user based on the identity provider email will appear in Databricks when they log in. Since non-admin users will no longer be able to log in with their old email address and password, they will not be able to access the files in their existing Users folder.
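Before enabling SSO, it can help to list which users fall into the second case. The sketch below assumes you have already assembled a mapping from each Databricks username to that person's identity provider email, for example from exported user lists; the function name `users_needing_migration` and the sample addresses are hypothetical. It uses an exact string comparison, because whether email matching is case-sensitive is not stated here.

```python
def users_needing_migration(databricks_to_idp_email):
    """Return Databricks usernames whose identity provider email differs.

    databricks_to_idp_email: dict mapping the current Databricks username
    (an email address) to the email address held by the identity provider.
    """
    return sorted(
        databricks_email
        for databricks_email, idp_email in databricks_to_idp_email.items()
        if databricks_email != idp_email  # exact match assumed
    )


mapping = {
    "alice@example.com": "alice@example.com",          # same: automatic
    "bob@old.example.com": "bob@new.example.com",      # differs: migrate
}
print(users_needing_migration(mapping))
```

Each username in the result will get a new, empty Users folder on first SSO login, so the old folder needs to be migrated.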
We recommend the following steps to migrate files from the old Users folder to the new Users folder:
An admin can remove the old user. This marks the user’s folder as defunct, and the folder moves below all the active users in the workspace. All the notebooks and libraries remain accessible to admins. All the clusters and jobs created by the user remain as is. If the user had any other ACLs set, enabling SSO resets them, and the admin has to set those ACLs manually for the new user.
The admin can then move the old user’s folder into the new user’s folder.