Hive metastore privileges and securable objects (legacy)
This article describes the privilege model for the legacy Databricks Hive metastore, which is built in to each Databricks workspace. It also describes how to grant, deny, and revoke privileges for objects in the built-in Hive metastore. Unity Catalog uses a different model for granting privileges. See Unity Catalog privileges and securable objects.
Table access control for data managed by the Hive metastore is a legacy data governance model. Databricks recommends that you upgrade the tables managed by the Hive metastore to the Unity Catalog metastore. Unity Catalog simplifies security and governance of your data by providing a central place to administer and audit data access across multiple workspaces in your account. To learn more about how the legacy privilege model differs from the Unity Catalog privilege model, see Work with Unity Catalog and the legacy Hive metastore.
An administrator must enable and enforce table access control for the workspace.
The cluster must be enabled for table access control.
Data access control is always enabled in Databricks SQL even if table access control is not enabled for the workspace.
If table access control is enabled for the workspace and you have already specified ACLs (granted and denied privileges) in the workspace, those ACLs are respected in Databricks SQL.
Manage privileges on objects in the Hive metastore
Privileges on data objects managed by the Hive metastore can be granted by either a workspace admin or the owner of an object. You can manage privileges for Hive metastore objects by using SQL commands.
To manage privileges in SQL, you use GRANT, REVOKE, DENY, MSCK, and SHOW GRANTS statements in a notebook or the Databricks SQL query editor, using the syntax:
GRANT privilege_type ON securable_object TO principal
privilege_typeis a Hive metastore privilege type
securable_objectis a securable object in the Hive metastore
principalis a user, service principal (represented by its applicationId value), or group. You must enclose users, service principals, and group names with special characters in backticks (
` `). See Principal.
For more information about managing privileges for objects in the Hive metastore, see Privileges and securable objects in the Hive metastore
When table access control is enabled on a cluster or SQL warehouse, a user who creates a schema, table, view, or function becomes its owner. The owner is granted all privileges and can grant privileges to other users.
Groups may own objects, in which case all members of that group are considered owners.
Either the owner of an object or a workspace admin can transfer ownership of an object using the
ALTER <object> OWNER TO `<user-name>@<user-domain>.com` command.
Securable objects in the Hive metastore
The securable objects are:
CATALOG: controls access to the entire data catalog.
SCHEMA: controls access to a schema.
TABLE: controls access to a managed or external table.
VIEW: controls access to SQL views.
FUNCTION: controls access to a named function.
ANONYMOUS FUNCTION: controls access to anonymous or temporary functions.
ANONYMOUS FUNCTIONobjects are not supported in Databricks SQL.
ANY FILE: controls access to the underlying filesystem.
Users granted access to
ANY FILEcan bypass the restrictions put on the catalog, schemas, tables, and views by reading from the filesystem directly.
Privileges you can grant on Hive metastore objects
SELECT: gives read access to an object.
CREATE: gives ability to create an object (for example, a table in a schema).
MODIFY: gives ability to add, delete, and modify data to or from an object.
USAGE: does not give any abilities, but is an additional requirement to perform any action on a schema object.
READ_METADATA: gives ability to view an object and its metadata.
CREATE_NAMED_FUNCTION: gives ability to create a named UDF in an existing catalog or schema.
MODIFY_CLASSPATH: gives ability to add files to the Spark class path.
ALL PRIVILEGES: gives all privileges (is translated into all the above privileges).
MODIFY_CLASSPATH privilege is not supported in Databricks SQL.
To perform an action on a schema object in the Hive metastore, a user must have the
USAGE privilege on that schema in addition to the privilege to perform that action. Any one of the following satisfies the
Be a workspace admin
USAGEprivilege on the schema or be in a group that has the
USAGEprivilege on the schema
USAGEprivilege on the
CATALOGor be in a group that has the
Be the owner of the schema or be in a group that owns the schema
Even the owner of an object inside a schema must have the
USAGE privilege in order to use it.
When table access control is enabled on the workspace and on all clusters, SQL objects in Databricks are hierarchical and privileges are inherited downward. This means that granting or denying a privilege on the
CATALOG automatically grants or denies the privilege to all schemas in the catalog. Similarly, privileges granted on a schema object are inherited by all objects in that schema. This pattern is true for all securable objects.
If you deny a user privileges on a table, the user can’t see the table by attempting to list all tables in the schema. If you deny a user privileges on a schema, the user can’t see that the schema exists by attempting to list all schemas in the catalog.
Dynamic view functions
Databricks includes two user functions that allow you to express column- and row-level permissions dynamically in the body of a view definition that is managed by the Hive metastore.
current_user(): return the current user name.
is_member(): determine if the current user is a member of a specific Databricks group.
To use these functions in Databricks Runtime 7.3 LTS, you must set the Spark config
spark.databricks.userInfoFunctions.enabled true. They are enabled by default in all supported Databricks Runtime versions above 7.3.
The following example combines both functions to determine if a user has the appropriate group membership:
-- Return: true if the user is a member and false if they are not SELECT current_user as user, -- Check to see if the current user is a member of the "Managers" group. is_member("Managers") as admin
You can use dynamic views to limit the columns a specific group or user can see. Consider the following example where only users who belong to the
auditors group are able to see email addresses from the
sales_raw table. At analysis time Spark replaces the
CASE statement with either the literal
'REDACTED' or the column
-- Alias the field 'email' to itself (as 'email') to prevent the -- permission logic from showing up directly in the column name results. CREATE VIEW sales_redacted AS SELECT user_id, CASE WHEN is_member('auditors') THEN email ELSE 'REDACTED' END AS email, country, product, total FROM sales_raw
Using dynamic views you can specify permissions down to the row or field level. Consider the following example, where only users who belong to the
managers group are able to see transaction amounts (
total column) greater than $1,000,000.00:
CREATE VIEW sales_redacted AS SELECT user_id, country, product, total FROM sales_raw WHERE CASE WHEN is_member('managers') THEN TRUE ELSE total <= 1000000 END;
As shown in the preceding examples, you can implement column-level masking to prevent users from seeing specific column data unless they are in the correct group. Because these views are standard Spark SQL, you can do more advanced types of masking with more complex SQL expressions. The following example lets all users perform analysis on email domains, but lets members of the
auditors group see users’ full email addresses.
-- The regexp_extract function takes an email address such as -- email@example.com and extracts 'example', allowing -- analysts to query the domain name CREATE VIEW sales_redacted AS SELECT user_id, region, CASE WHEN is_member('auditors') THEN email ELSE regexp_extract(email, '^.*@(.*)$', 1) END FROM sales_raw