This article explains how to create and manage policies in your workspace. For information on writing policy definitions, see Compute policy reference.
Policies require the Premium plan or above.
A policy is a tool workspace admins can use to limit a user or group’s compute creation permissions based on a set of policy rules.
Policies provide the following benefits:
Limit users to creating clusters with prescribed settings.
Limit users to creating a certain number of clusters.
Simplify the user interface and enable more users to create their own clusters (by fixing and hiding some values).
Control cost by limiting the maximum cost per cluster (by setting limits on attributes whose values contribute to the hourly price).
Enforce cluster-scoped library installations (Public Preview).
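As a rough illustration of how a policy can fix and hide a value and cap cost-related attributes, the following sketch builds a small policy definition as a Python dictionary and serializes it to JSON. The attribute values and node type names are placeholders chosen for this example, not recommendations. See Compute policy reference for the authoritative list of attributes and rule types.

```python
import json

# Illustrative policy definition. All values below are placeholders.
policy_definition = {
    # Fix the auto-termination value and hide it from the UI.
    "autotermination_minutes": {"type": "fixed", "value": 30, "hidden": True},
    # Cap the cluster size, which bounds the hourly cost.
    "num_workers": {"type": "range", "maxValue": 4, "defaultValue": 2},
    # Restrict users to a short list of node types (hypothetical names).
    "node_type_id": {
        "type": "allowlist",
        "values": ["example-small", "example-medium"],
    },
}

# Policy definitions are written as JSON documents.
definition_json = json.dumps(policy_definition, indent=2)
print(definition_json)
```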
These are the basic instructions for creating a policy. To learn how to define a policy, see Compute policy reference.
Click Compute in the sidebar.
Click the Policies tab.
Click Create policy.
Name the policy. Policy names are case-insensitive.
Optionally, select a policy family from the Family dropdown. This determines the template from which you build the policy.
Enter a Description of the policy. This helps others know the purpose of the policy.
In the Definitions tab, enter a policy definition.
In the Libraries tab, add any compute-scoped libraries that you want the policy to install on the compute. See Add libraries to a policy.
In the Permissions tab, assign permissions for the policy and optionally set the maximum number of resources a user can create using that policy.
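The UI steps above could also be scripted. The following is a minimal sketch that only builds (and does not send) an HTTP request for creating a policy. It assumes the Cluster Policies REST endpoint `/api/2.0/policies/clusters/create` and its `name`, `definition`, `description`, and `max_clusters_per_user` fields. The helper name, host, and token are hypothetical, so check the REST API reference for your workspace before relying on it.

```python
import json
import urllib.request


def build_create_policy_request(host, token, name, definition,
                                description="", max_clusters_per_user=None):
    """Build a POST request for creating a policy (hypothetical helper)."""
    payload = {
        "name": name,
        # The definition is passed as a JSON-encoded string.
        "definition": json.dumps(definition),
        "description": description,
    }
    if max_clusters_per_user is not None:
        payload["max_clusters_per_user"] = max_clusters_per_user
    return urllib.request.Request(
        url=f"{host}/api/2.0/policies/clusters/create",  # assumed endpoint
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_policy_request(
    host="https://example.invalid",  # placeholder workspace URL
    token="EXAMPLE_TOKEN",           # placeholder token
    name="small-clusters",
    definition={"num_workers": {"type": "range", "maxValue": 4}},
    description="Limits cluster size for ad hoc work.",
    max_clusters_per_user=3,
)
# To send it: urllib.request.urlopen(request)
```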
When you create a policy, you can choose to use a policy family. Policy families are Databricks-provided policy templates with pre-populated rules, designed to address common compute use cases.
When using a policy family, the rules for your policy are inherited from the policy family. After selecting a policy family, you can create the policy as-is, or choose to add rules or override the given rules. For more on policy families, see Default policies and policy families.
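Conceptually, adding or overriding rules behaves like layering your own rules on top of the inherited ones. The sketch below models that idea with plain dictionaries. The function name and the rule values are hypothetical, and the actual merge is performed by Databricks, so treat this only as a mental model.

```python
def apply_overrides(family_rules, overrides):
    """Return the effective rules: inherited rules, then your overrides."""
    effective = dict(family_rules)  # copy so the template is not mutated
    effective.update(overrides)     # overrides replace or add whole rules
    return effective


# Hypothetical rules inherited from a policy family.
family_rules = {
    "autotermination_minutes": {"type": "fixed", "value": 120},
    "num_workers": {"type": "range", "maxValue": 10},
}

# Override one inherited rule and add a new one.
overrides = {
    "autotermination_minutes": {"type": "fixed", "value": 30},
    "custom_tags.team": {"type": "fixed", "value": "analytics"},
}

effective_rules = apply_overrides(family_rules, overrides)
```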
This feature is in Public Preview.
You can add libraries to a policy so libraries are automatically installed on compute resources. You can add a maximum of 500 libraries to a policy.
You may have previously added compute-scoped libraries using init scripts. Databricks recommends using compute policies instead of init scripts to install libraries.
To add a library to your policy:
At the bottom of the Create policy page, click the Libraries tab.
Click Add library.
Select one of the Library Source options, then follow the instructions as outlined below:
Select a workspace file or upload a Whl, zipped wheelhouse, JAR, ZIP, tar, or requirements.txt file.
Select a Whl or JAR file from a volume.
Select the library type and provide the full URI to the library object (for example:
Enter a PyPI package name. See PyPI package.
Specify a Maven coordinate. See Maven or Spark package.
Enter the name of a package. See CRAN package.
DBFS (Not recommended)
Load a JAR or Whl file to the DBFS root. This is not recommended, as files stored in DBFS can be modified by any workspace user.
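If you manage policies programmatically, libraries are typically expressed as a list of small specification objects. The sketch below assumes the object shapes used by the Databricks Libraries API (`pypi`, `maven`, `cran`, `jar`, `whl`, `requirements`) and checks the 500-library limit stated in this article. The validator function, package names, and paths are hypothetical examples.

```python
MAX_LIBRARIES_PER_POLICY = 500  # limit stated in this article

# Assumed library spec keys, modeled on the Libraries API.
ALLOWED_KINDS = {"pypi", "maven", "cran", "jar", "whl", "requirements"}


def validate_policy_libraries(libraries):
    """Raise ValueError if the list is too long or a spec is malformed."""
    if len(libraries) > MAX_LIBRARIES_PER_POLICY:
        raise ValueError(
            f"A policy can include at most {MAX_LIBRARIES_PER_POLICY} libraries"
        )
    for spec in libraries:
        # Each spec is expected to hold exactly one library source.
        if len(spec) != 1 or next(iter(spec)) not in ALLOWED_KINDS:
            raise ValueError(f"Unrecognized library spec: {spec!r}")
    return libraries


libraries = validate_policy_libraries([
    {"pypi": {"package": "example-package==1.0.0"}},          # placeholder
    {"maven": {"coordinates": "com.example:example-lib:1.0"}},  # placeholder
    {"whl": "/Volumes/example/default/libs/example.whl"},       # placeholder
])
```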
If you add libraries to a policy:
Users can’t install or uninstall compute-scoped libraries on compute that uses this policy.
Libraries configured through the UI, REST API, or CLI on existing compute are removed the next time the compute restarts.
Dependency libraries for tasks that use this policy in jobs compute resources are disabled.
By default, workspace admins have permissions on all policies. Non-admin users must be granted permissions on a policy for them to have access to the policy.
If a user has unrestricted cluster creation permissions, then they will also have access to the Unrestricted policy. This allows them to create fully configurable compute resources.
If a user doesn’t have access to any policies, the policy dropdown does not display in their UI.
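Granting non-admin users access to a policy can also be scripted. The sketch below only builds a request body. It assumes the Permissions API shape for cluster policies (an `access_control_list` whose entries grant the `CAN_USE` permission level, sent to `/api/2.0/permissions/cluster-policies/<policy-id>`). The helper name and the user and group names are hypothetical, so verify the details against the REST API reference.

```python
def build_policy_permissions_payload(user_names=(), group_names=()):
    """Build an access control list granting CAN_USE (assumed level)."""
    acl = [
        {"user_name": user, "permission_level": "CAN_USE"}
        for user in user_names
    ]
    acl += [
        {"group_name": group, "permission_level": "CAN_USE"}
        for group in group_names
    ]
    return {"access_control_list": acl}


permissions_payload = build_policy_permissions_payload(
    user_names=["someone@example.com"],  # placeholder user
    group_names=["data-analysts"],       # placeholder group
)
```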
Policy permissions allow you to set a maximum number of compute resources per user. This determines how many resources a user can create using that policy. If creating a resource would exceed the limit, the operation fails.
To restrict the number of resources a user can create using a policy, enter a value into the Max compute resources per user setting under the Permissions tab in the policies UI.
Databricks doesn’t proactively terminate resources to maintain the limit. If a user has three compute resources running with the policy and the workspace admin reduces the limit to one, the three resources will continue to run. Extra resources must be manually terminated to comply with the limit.
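The behavior described above can be modeled with a small simulation. This is not Databricks code. The class and method names are hypothetical, and the sketch only illustrates that the limit is checked when a resource is created and that lowering the limit does not terminate existing resources.

```python
class PolicyQuota:
    """Toy model of a per-user resource limit on a single policy."""

    def __init__(self, max_per_user):
        self.max_per_user = max_per_user
        self.resources = {}  # user name -> list of resource names

    def create(self, user, name):
        owned = self.resources.setdefault(user, [])
        # The limit is enforced only at creation time.
        if len(owned) >= self.max_per_user:
            raise PermissionError(
                f"{user} already has {len(owned)} resources "
                f"(limit {self.max_per_user})"
            )
        owned.append(name)

    def set_limit(self, new_limit):
        # Lowering the limit does not terminate anything.
        self.max_per_user = new_limit


quota = PolicyQuota(max_per_user=3)
for resource_name in ("etl", "adhoc", "training"):
    quota.create("alice", resource_name)

quota.set_limit(1)  # admin reduces the limit
still_running = len(quota.resources["alice"])  # all three keep running
```

In this model, any further `quota.create("alice", ...)` call fails until enough of the existing resources are manually removed.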
After you create a policy, you can edit, clone, and delete it.
You can also monitor the policy’s adoption by viewing the compute resources that use the policy. From the Policies page, click the policy you want to view. Then click the Compute or Jobs tab to see a list of resources that use the policy.
You might want to edit a policy to update its permissions or its definition. To edit a policy, select it on the Policies page, then click Edit. From there, you can update the policy’s permissions in the Permissions tab and update the policy’s definition.
After you update a policy’s definitions, the compute that uses that policy does not automatically update to adhere to the new policy rules, but the policy rules will be in effect if the user attempts to edit the compute resource.
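Because existing compute is not updated automatically, it can be useful to see which resources would no longer comply with an edited policy. The following sketch checks a compute configuration against a simplified subset of rule types (`fixed`, `allowlist`, and `range`). The function name and the sample values are hypothetical, and real policy evaluation is done by Databricks and covers more rule types.

```python
def find_violations(config, rules):
    """Return the attribute names in config that break the given rules."""
    violations = []
    for attribute, rule in rules.items():
        value = config.get(attribute)
        kind = rule["type"]
        if kind == "fixed":
            ok = value == rule["value"]
        elif kind == "allowlist":
            ok = value in rule["values"]
        elif kind == "range":
            ok = (
                value is not None
                and value >= rule.get("minValue", value)
                and value <= rule.get("maxValue", value)
            )
        else:
            ok = True  # rule types outside this sketch are not checked
        if not ok:
            violations.append(attribute)
    return violations


# Rules after an edit (placeholder values).
updated_rules = {
    "autotermination_minutes": {"type": "fixed", "value": 30},
    "num_workers": {"type": "range", "maxValue": 4},
}

# An existing resource created under the earlier, looser rules.
existing_compute = {"autotermination_minutes": 60, "num_workers": 8}

out_of_policy = find_violations(existing_compute, updated_rules)
```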
You can also use the cloning feature to create a new policy from an existing policy. Open the policy you want to clone, then click the Clone button. Change the values of any fields you want to modify, then click Create.