Data object privileges
The Databricks data governance model lets you programmatically grant, deny, and revoke access to your data from Spark SQL. This model lets you control access to securable objects like catalogs, schemas (databases), tables, views, and functions. It also allows for fine-grained access control (to a particular subset of a table, for example) by setting privileges on derived views created from arbitrary queries. The Databricks SQL query analyzer enforces these access control policies at runtime on Databricks clusters with table access control enabled and all SQL warehouses.
This article describes the privileges, objects, and ownership rules that make up the Databricks Hive metastore data governance model. It also describes how to grant, deny, and revoke object privileges.
Unity Catalog uses a different model for granting privileges. See Unity Catalog privileges and securable objects.
Requirements
The requirements for managing object privileges depends on your environment:
Databricks Data Science & Engineering and Databricks Machine Learning
An administrator must enable and enforce table access control for the workspace.
The cluster must be enabled for table access control.
Data governance model
This section describes the Databricks data governance model. Access to securable data objects is governed by privileges.
Securable objects
The securable objects are:
CATALOG
: controls access to the entire data catalog.SCHEMA
: controls access to a schema.TABLE
: controls access to a managed or external table.VIEW
: controls access to SQL views.FUNCTION
: controls access to a named function.
ANONYMOUS FUNCTION
: controls access to anonymous or temporary functions.Note
ANONYMOUS FUNCTION
objects are not supported in Databricks SQL.ANY FILE
: controls access to the underlying filesystem.Warning
Users granted access to
ANY FILE
can bypass the restrictions put on the catalog, schemas, tables, and views by reading from the filesystem directly.
Privileges
SELECT
: gives read access to an object.CREATE
: gives ability to create an object (for example, a table in a schema).MODIFY
: gives ability to add, delete, and modify data to or from an object.USAGE
: does not give any abilities, but is an additional requirement to perform any action on a schema object.READ_METADATA
: gives ability to view an object and its metadata.CREATE_NAMED_FUNCTION
: gives ability to create a named UDF in an existing catalog or schema.MODIFY_CLASSPATH
: gives ability to add files to the Spark class path.ALL PRIVILEGES
: gives all privileges (is translated into all the above privileges).
Note
The MODIFY_CLASSPATH
privilege is not supported in Databricks SQL.
USAGE
privilege
To perform an action on a schema object, a user must have the USAGE
privilege on that schema in addition to the privilege to perform that action. Any one of the following satisfy the USAGE
requirement:
Be an admin
Have the
USAGE
privilege on the schema or be in a group that has theUSAGE
privilege on the schemaHave the
USAGE
privilege on theCATALOG
or be in a group that has theUSAGE
privilegeBe the owner of the schema or be in a group that owns the schema
Even the owner of an object inside a schema must have the USAGE
privilege in order to use it.
As an example, an administrator could define a finance
group and an accounting
schema for them to use.
To set up a schema that only the finance team can use and share, an admin would do the following:
CREATE SCHEMA accounting;
GRANT USAGE ON SCHEMA accounting TO finance;
GRANT CREATE ON SCHEMA accounting TO finance;
With these privileges, members of the finance
group can create tables and views in the accounting
schema,
but can’t share those tables or views with any principal that does not have USAGE
on the accounting
schema.
Databricks Data Science & Engineering and Databricks Runtime version behavior
Clusters running Databricks Runtime 7.3 LTS and above enforce the
USAGE
privilege.Clusters running Databricks Runtime 7.2 and below do not enforce the
USAGE
privilege.To ensure that existing workloads function unchanged, in workspaces that used table access control before
USAGE
was introduced have had theUSAGE
privilege onCATALOG
granted to theusers
group. If you want to take advantage of theUSAGE
privilege, you must runREVOKE USAGE ON CATALOG FROM users
and thenGRANT USAGE ...
as needed.
Privilege hierarchy
When table access control is enabled on the workspace and on all clusters, SQL objects in Databricks are hierarchical and privileges are inherited downward. This means that granting or denying a privilege on the CATALOG
automatically grants or denies the privilege to all schemas in the catalog. Similarly, privileges granted on a schema object are inherited by all objects in that schema. This pattern is true for all securable objects.
If you deny a user privileges on a table, the user can’t see the table by attempting to list all tables in the schema. If you deny a user privileges on a schema, the user can’t see that the schema exists by attempting to list all schemas in the catalog.
Object ownership
When table access control is enabled on a cluster or SQL warehouse, a user who creates a schema, table, view, or function becomes its owner. The owner is granted all privileges and can grant privileges to other users.
Groups may own objects, in which case all members of that group are considered owners.
Ownership determines whether or not you can grant privileges on derived objects to other users.
For example, suppose user A owns table T and grants user B SELECT
privilege on table T. Even
though user B can select from table T, user B cannot grant SELECT
privilege on table T to user C,
because user A is still the owner of the underlying table T. Furthermore, user B cannot circumvent
this restriction simply by creating a view V on table T and granting privileges on that view to
user C. When Databricks checks for privileges for user C to access view V, it also checks that
the owner of V and underlying table T are the same. If the owners are not the same, user C must
also have SELECT
privileges on underlying table T.
When table access control is disabled on a cluster, no owner is registered when a schema, table, view,
or function is created. To test if an object has an owner, run SHOW GRANTS ON <object-name>
.
If you do not see an entry with ActionType OWN
, the object does not have an owner.
Assign owner to object
Either the owner of an object or an administrator can transfer ownership of an object using the ALTER <object> OWNER TO `<user-name>@<user-domain>.com`
command:
ALTER SCHEMA <schema-name> OWNER TO `<user-name>@<user-domain>.com`
ALTER TABLE <table-name> OWNER TO `group_name`
ALTER VIEW <view-name> OWNER TO `<user-name>@<user-domain>.com`
ALTER FUNCTION <function-name> OWNER TO `<user-name>@<user-domain>.com`
Users and groups
Administrators and owners can grant privileges to users and groups.
Each user is uniquely identified by their username in Databricks (which typically maps to their email address).
All users are implicitly a part of the “All Users” group, represented as users
in SQL.
Note
You must enclose user specifications in backticks ( ` `
), not single quotes (' '
).
Operations and privileges
In Databricks, admin users can manage all object privileges, effectively have all privileges granted on all securables, and can change the owner of any object. Owners of an object can perform any action on that object, can grant privileges on that object to other principals, and can transfer ownership of the object to another principal. The only limit to an owner’s privileges is for objects within a schema; to interact with an object in a schema the user must also have USAGE
on that schema.
The following table maps SQL operations to the privileges required to perform that operation.
Note
Any place where a privilege on a table, view, or function is required,
USAGE
is also required on the schema it’s in.In any place where a table is referenced in a command, a path could also be referenced. In those instances
SELECT
orMODIFY
is required onANY FILE
instead ofUSAGE
on the schema and another privilege on the table.Object ownership is represented here as the
OWN
privilege.
Operation |
Required Privileges |
---|---|
Ability to |
|
|
|
|
|
|
|
Either |
|
Either |
|
Either |
|
Either |
|
|
|
|
|
Usually |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Important
When you use table access control, DROP TABLE
statements are case sensitive. If a table name is lower case and the DROP TABLE
references the table name using mixed or upper case, the DROP TABLE
statement will fail.
Manage object privileges
You use the GRANT
, DENY
, REVOKE
, MSCK
, and SHOW GRANTS
operations to manage object privileges.
Note
An owner or an administrator of an object can perform
GRANT
,DENY
,REVOKE
, andSHOW GRANTS
operations. However, an administrator cannot deny privileges to or revoke privileges from an owner.A principal that’s not an owner or administrator can perform an operation only if the required privilege has been granted.
To grant, deny, or revoke a privilege for all users, specify the keyword
users
afterTO
. For example,GRANT SELECT ON ANY FILE TO users
Examples
GRANT SELECT ON SCHEMA <schema-name> TO `<user>@<domain-name>`
GRANT SELECT ON ANONYMOUS FUNCTION TO `<user>@<domain-name>`
GRANT SELECT ON ANY FILE TO `<user>@<domain-name>`
SHOW GRANTS `<user>@<domain-name>` ON SCHEMA <schema-name>
DENY SELECT ON <table-name> TO `<user>@<domain-name>`
REVOKE ALL PRIVILEGES ON SCHEMA default FROM `<user>@<domain-name>`
REVOKE SELECT ON <table-name> FROM `<user>@<domain-name>`
GRANT SELECT ON ANY FILE TO users
Dynamic view functions
Databricks includes two user functions that allow you to express column- and row-level permissions dynamically in the body of a view definition.
current_user()
: return the current user name.is_member()
: determine if the current user is a member of a specific Databricks group.
Note
Available in Databricks Runtime 7.3 LTS and above. However, to use these functions in Databricks Runtime 7.3 LTS, you must set the Spark config spark.databricks.userInfoFunctions.enabled true
.
Consider the following example, which combines both functions to determine if a user has the appropriate group membership:
-- Return: true if the user is a member and false if they are not
SELECT
current_user as user,
-- Check to see if the current user is a member of the "Managers" group.
is_member("Managers") as admin
Allowing administrators to set fine granularity privileges for multiple users and groups within a single view is both expressive and powerful, while saving on administration overhead.
Column-level permissions
Through dynamic views it’s easy to limit what columns a specific group or user can see. Consider the following example where only users who belong to the auditors
group are able to see email addresses from the sales_raw
table. At analysis time Spark replaces the CASE
statement with either the literal 'REDACTED'
or the column email
. This behavior allows for all the usual performance optimizations provided by Spark.
-- Alias the field 'email' to itself (as 'email') to prevent the
-- permission logic from showing up directly in the column name results.
CREATE VIEW sales_redacted AS
SELECT
user_id,
CASE WHEN
is_member('auditors') THEN email
ELSE 'REDACTED'
END AS email,
country,
product,
total
FROM sales_raw
Row-level permissions
Using dynamic views you can specify permissions down to the row or field level. Consider the following example, where only users who belong to the managers
group are able to see transaction amounts (total
column) greater than $1,000,000.00:
CREATE VIEW sales_redacted AS
SELECT
user_id,
country,
product,
total
FROM sales_raw
WHERE
CASE
WHEN is_member('managers') THEN TRUE
ELSE total <= 1000000
END;
Data masking
As shown in the preceding examples, you can implement column-level masking to prevent users from seeing specific column data unless they are in the correct group. Because these views are standard Spark SQL, you can do more advanced types of masking with more complex SQL expressions. The following example lets all users perform analysis on email domains, but lets members of the auditors
group see users’ full email addresses.
-- The regexp_extract function takes an email address such as
-- user.x.lastname@example.com and extracts 'example', allowing
-- analysts to query the domain name
CREATE VIEW sales_redacted AS
SELECT
user_id,
region,
CASE
WHEN is_member('auditors') THEN email
ELSE regexp_extract(email, '^.*@(.*)$', 1)
END
FROM sales_raw
Frequently asked questions (FAQ)
How do I grant, deny, or revoke a privilege for all users?
Specify the keyword users
after TO
or FROM
. For example:
GRANT SELECT ON TABLE <schema-name>.<table-name> TO users
I created an object but now I can’t query, drop, or modify it.
This error can occur because you created that object on a cluster or SQL warehouse without table access control enabled. When table access control is disabled on a cluster or SQL warehouse, owners are not registered when a schema, table, or view is created. An admin must assign an owner to the object using the following command:
ALTER [SCHEMA | TABLE | VIEW] <object-name> OWNER TO `<user-name>@<user-domain>.com`;
How do I grant privileges on global and local temporary views?
Privileges on global and local temporary views are not supported. Local temporary
views are visible only within the same session, and views created in the global_temp
schema are
visible to all users sharing a cluster or SQL warehouse. However, privileges on the underlying tables and views
referenced by any temporary views are enforced.
How do I grant a user or group privileges on multiple tables at once?
A grant, deny, or revoke statement can be applied to only one object at a time. The recommended way
to organize and grant privileges on multiple tables to a principal is via schemas. Granting a
principal SELECT
privilege on a schema implicitly grants that principal SELECT
privileges on
all tables and views in that schema. For example, if a schema D has tables t1 and t2, and an
admin issues the following GRANT
command:
GRANT USAGE, SELECT ON SCHEMA D TO `<user>@<domain-name>`
The principal <user>@<domain-name>
can select from tables t1 and t2, as well as any tables and views created in schema D in the future.
How do I grant a user privileges on all tables except one?
You grant SELECT
privilege to the schema and then deny SELECT
privilege for the specific table you want to restrict access to.
GRANT USAGE, SELECT ON SCHEMA D TO `<user>@<domain-name>`
DENY SELECT ON TABLE D.T TO `<user>@<domain-name>`
The principal <user>@<domain-name>
can select from all tables in D except D.T.
A user has SELECT
privileges on a view of table T, but when that user tries to SELECT
from that view, they get the error User does not have privilege SELECT on table
.
This common error can occur for one of the following reasons:
Table T has no registered owner because it was created using a cluster or SQL warehouse for which table access control is disabled.
The grantor of the
SELECT
privilege on a view of table T is not the owner of table T or the user does not also have selectSELECT
privilege on table T.
Suppose there is a table T owned by A. A owns view V1 on T and B owns view V2 on T.
A user can select on V1 when A has granted
SELECT
privileges on view V1.A user can select on V2 when A has granted
SELECT
privileges on table T and B has grantedSELECT
privileges on V2.
As described in the Object ownership section, these conditions ensure that only the owner of an object can grant other users access to that object.
I tried to run sc.parallelize
on a cluster with table access control enabled and it fails.
On clusters with table access control enabled you can use only the Spark SQL and Python DataFrame APIs. The RDD API is disallowed for security reasons, since Databricks does not have the ability to inspect and authorize code within an RDD.
I want to manage permissions from infrastructure-as-code
You can manage table access control in a fully automated setup using Databricks Terraform provider and databricks_sql_permissions:
resource "databricks_sql_permissions" "foo_table" {
table = "foo"
privilege_assignments {
principal = "serge@example.com"
privileges = ["SELECT", "MODIFY"]
}
privilege_assignments {
principal = "special group"
privileges = ["SELECT"]
}
}