Core concepts for attribute-based access control (ABAC)

This page introduces ABAC and its benefits, and explains how it uses governed tags, policies, and user-defined functions (UDFs) to control which rows users can see and how column values are presented. It also covers the permissions required to set up ABAC and the separation of duties it enables across teams.

See Attribute-based access control in Unity Catalog for an overview of all ABAC topics, including tutorials, policy management, best practices, and limitations.

What is ABAC?

Attribute-based access control (ABAC) is a dynamic access control model where access decisions are based on policies evaluated against attributes associated with securables. In Unity Catalog, these attributes are represented through governed tags. These governed tags are used in policy conditions to match data objects within a given scope, such as a catalog or a schema. This allows a single policy to apply automatically across multiple data objects that meet its conditions.

For example, an ABAC policy might mask all columns tagged PII for tables within schemas tagged HR. As new data objects are created and tagged, the policy applies automatically without requiring separate policy definitions for each object.

ABAC supports row-level and column-level security through row filter policies and column mask policies on tables, materialized views, and streaming tables. Row filter policies restrict which rows a user can see. Column mask policies control how column values are presented to users.

For a comparison with table-level row filters and column masks, see When to use ABAC vs table-level row filters and column masks.

Governed tags

In Unity Catalog, attributes are implemented as governed tags. Governed tags are key-value pairs defined at the account level and applied to Unity Catalog securables, such as catalogs, schemas, tables, and columns, in addition to workspace objects. They represent characteristics such as sensitivity, classification, or business domain.

By default, tags inherit from parent catalogs and schemas to tables, but not from tables to columns. You can override inherited tags at any level, but column-level tags must be applied directly.

Governed tags hierarchy diagram

Governed tags can be referenced in policy conditions using built-in functions like has_tag() and has_tag_value(), which check whether a given tag is present on the target data object, either directly or through tag inheritance.
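The following is a minimal sketch of applying governed tags with SQL so that policy conditions have something to match. It assumes the sensitivity and pii tags already exist in the account's tag taxonomy with these allowed values. The hr_catalog.payroll schema, the employees table, and its ssn column are hypothetical names.

```sql
-- Hypothetical objects: hr_catalog.payroll.employees with an ssn column.
-- Assumes the governed tags sensitivity and pii are already defined.

-- Tag the schema. Tables in the schema inherit this tag.
ALTER SCHEMA hr_catalog.payroll SET TAGS ('sensitivity' = 'confidential');

-- Tag the column directly. Columns do not inherit tags from their table.
ALTER TABLE hr_catalog.payroll.employees
  ALTER COLUMN ssn SET TAGS ('pii' = 'ssn');

-- List the column-level tags recorded for the table.
SELECT column_name, tag_name, tag_value
FROM hr_catalog.information_schema.column_tags
WHERE schema_name = 'payroll' AND table_name = 'employees';
```

With these tags in place, has_tag_value('sensitivity', 'confidential') in a table condition and has_tag_value('pii', 'ssn') in a column condition would both have matching objects.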

Governed tags are defined at the account level. This means that you can use the same tag taxonomy across your entire data estate in an account, including across multiple metastores.

For more information, see Governed tags and Apply tags to Unity Catalog securable objects.

Policies

Policies are attached to securable objects in Unity Catalog to define access control rules based on tag conditions. Below is an example:

SQL
CREATE FUNCTION mask_pii(val STRING) RETURNS STRING
RETURN '***';

CREATE POLICY mask_pii_for_hr
ON CATALOG catalog_a
COLUMN MASK mask_pii
TO `account users` EXCEPT `HR admins`
FOR TABLES
WHEN has_tag('HR')
MATCH COLUMNS has_tag('PII') AS pii_col
ON COLUMN pii_col;

Each policy specifies:

  • Scope: The securable where the policy is attached, specified by the ON clause. Attaching a policy to a securable means the policy conditions are evaluated for all objects of the type specified in the FOR clause, across that securable and all of its descendants.
    • Supported policy scopes are currently CATALOG, SCHEMA, and TABLE.
    • Tables, including streaming tables and materialized views, are currently the only supported securable type, specified using the FOR TABLES clause.
    • A policy attached at a catalog evaluates against all tables in that catalog. A policy attached at a schema evaluates against all tables in that schema. A policy attached at a table evaluates only against that table.
    Note: Databricks recommends attaching policies at the highest applicable level, usually the catalog, to maximize governance efficiency. See Best practices for ABAC policies.

  • Principals: Who the policy applies to and who is exempt. The TO clause specifies the users, groups, or service principals subject to the policy. The optional EXCEPT clause excludes specific principals from this policy.
  • Actions: Whether the policy applies a row filter or a column mask. The action is implemented by a user-defined function (UDF) that defines the filtering or masking logic. See Policy types.
  • Conditions: Tag-based expressions that determine which tables or columns the policy targets. See Conditions and built-in functions.

Policies are created and managed through the UI or programmatically, using SQL statements (such as CREATE POLICY, DROP POLICY, SHOW POLICIES, and DESCRIBE POLICY), REST APIs, Databricks SDKs, or Terraform. See Create and manage ABAC policies for the full syntax and examples.
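As a rough sketch, the management statements can be run against the mask_pii_for_hr policy from the example above. The ON CATALOG clause shown here follows the form used by CREATE POLICY and should be treated as an assumption. Create and manage ABAC policies remains the authoritative reference for the syntax.

```sql
-- List the policies attached to the catalog.
SHOW POLICIES ON CATALOG catalog_a;

-- Show the definition of a single policy.
DESCRIBE POLICY mask_pii_for_hr ON CATALOG catalog_a;

-- Remove the policy when it is no longer needed.
DROP POLICY mask_pii_for_hr ON CATALOG catalog_a;
```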

Policy types

ABAC supports two policy types: row filter policies and column mask policies. Both require UDFs to implement the filtering or masking logic.

Row filter policies

Row filter policies restrict which rows a user can see in a table, based on values in columns that tag-based conditions identify (see Conditions and built-in functions). The policy references a UDF that evaluates each row. Rows where the function returns FALSE are excluded from query results. Arguments are passed to the UDF through the USING COLUMNS clause.

Example use case: For a sales catalog, ensure the EMEA team sees only EMEA sales records across all tables that have a column tagged region.

SQL
CREATE FUNCTION filter_by_region(region STRING, allowed STRING) RETURNS BOOLEAN
RETURN region = allowed;

CREATE POLICY regional_access_emea
ON CATALOG sales
ROW FILTER filter_by_region
TO `emea team`
FOR TABLES
MATCH COLUMNS has_tag('region') AS rgn
USING COLUMNS (rgn, 'EMEA');
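One way to check the filter logic before relying on the policy is to call the UDF directly. This sketch assumes filter_by_region was created in the current catalog and schema. The region values are illustrative.

```sql
-- Direct calls to filter_by_region, outside any policy.
SELECT
  filter_by_region('EMEA', 'EMEA') AS emea_row,  -- TRUE: the row would be kept
  filter_by_region('APAC', 'EMEA') AS apac_row;  -- FALSE: the row would be excluded
```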

Column mask policies

Column mask policies control what values a user sees for specific columns that tag-based conditions identify (see Conditions and built-in functions). The policy references a UDF that takes the column value as input and returns either the original value or a masked version. The column named in the ON COLUMN clause is bound automatically as the UDF's first argument, and additional arguments can be passed through USING COLUMNS. The UDF's return type must match or be castable to the column's data type.

Example use case: Mask SSN columns tagged with pii : ssn so that users see only the last four digits (for example, ***-**-6789) unless they are in a compliance group exempt from the policy.

SQL
CREATE FUNCTION mask_ssn(ssn STRING, show_last INT) RETURNS STRING
RETURN CONCAT('***-**-', RIGHT(ssn, show_last));

CREATE POLICY mask_ssn_columns
ON CATALOG hr_catalog
COLUMN MASK mask_ssn
TO `account users` EXCEPT `compliance team`
FOR TABLES
MATCH COLUMNS has_tag_value('pii', 'ssn') AS ssn_col
ON COLUMN ssn_col
USING COLUMNS (4);
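To see what the mask produces, the UDF can be called directly with the same constant that the policy passes through USING COLUMNS. This is a sketch that assumes mask_ssn was created in the current catalog and schema. The SSN is a made-up sample value.

```sql
-- Direct call to mask_ssn, outside any policy.
-- RIGHT('123-45-6789', 4) keeps the last four characters.
SELECT mask_ssn('123-45-6789', 4) AS masked;  -- '***-**-6789'
```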

The USING COLUMNS clause passes arguments to the UDF. It accepts aliases for columns that match a tag-based expression, or constant values (quoted strings, numeric literals, boolean values (TRUE/FALSE), or NULL), supplied in the order the function expects them. For column mask policies, these are additional arguments beyond the masked column (which is bound automatically from ON COLUMN). This allows a single UDF to be reused across policies with different parameters.
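As an illustration of this reuse, a second row filter policy could reference the same filter_by_region UDF from the row filter example with a different constant. The policy name and the `apac team` group are hypothetical.

```sql
-- Same UDF as regional_access_emea, different constant argument.
-- The `apac team` group is a hypothetical principal.
CREATE POLICY regional_access_apac
ON CATALOG sales
ROW FILTER filter_by_region
TO `apac team`
FOR TABLES
MATCH COLUMNS has_tag('region') AS rgn
USING COLUMNS (rgn, 'APAC');
```

Only the principal and the constant change. The filtering logic stays in one function.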

SQL UDFs are recommended for better performance. Python UDFs registered in Unity Catalog are also supported, though the query optimizer cannot inline or optimize them the way it can SQL UDFs. See Performance considerations for guidance on UDF language selection.

Conditions and built-in functions

Conditions are tag-based expressions that determine which tables and columns a policy targets within its scope.

  • Table conditions (WHEN clause): Boolean expressions that match tables based on their tags. If omitted, defaults to TRUE, meaning the policy applies to all tables in scope.
  • Column conditions (MATCH COLUMNS clause): One or more comma-separated boolean expressions that identify which columns the policy targets. Each expression can be a single built-in function like has_tag('pii'), or a combination using logical operators like has_tag_value('pii', 'ssn') AND has_tag('sensitive'). Each expression can be assigned an alias (specified after AS) that can be referenced in the ON COLUMN and USING COLUMNS clauses. A policy can include up to 3 column expressions, and all must match for the policy to apply.
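A sketch of a policy that uses both clause types might look like the following. The finance catalog, the tag names and values, and the `finance admins` group are hypothetical. mask_pii is the UDF from the example in the Policies section.

```sql
-- Hypothetical policy combining a table condition with a compound column condition.
CREATE POLICY mask_sensitive_ssn
ON CATALOG finance
COLUMN MASK mask_pii
TO `account users` EXCEPT `finance admins`
FOR TABLES
-- Table condition: only tables tagged sensitivity : restricted are considered.
WHEN has_tag_value('sensitivity', 'restricted')
-- Column condition: one expression built from two functions joined with AND.
MATCH COLUMNS has_tag_value('pii', 'ssn') AND has_tag('sensitive') AS ssn_col
ON COLUMN ssn_col;
```

Here the compound expression counts as a single column expression with one alias, so it uses one of the three expressions a policy allows.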

Both clause types use the following built-in functions, evaluated by Unity Catalog against securable metadata:

| Function | Context | Description |
| --- | --- | --- |
| has_tag('tag_name') | Tables and columns | Returns true if the resource has the specified tag. In table conditions (WHEN), checks tags set directly on the table or inherited from a parent catalog or schema. In column conditions (MATCH COLUMNS), checks only tags set directly on the column; it does not match table tags. |
| has_tag_value('tag_name', 'tag_value') | Tables and columns | Returns true if the resource has the specified tag with the specified value. Same context behavior as has_tag(). |

Tags don't propagate from tables to columns. Using has_tag() in a MATCH COLUMNS clause only matches column-level tags, not tags on the parent table or its ancestors.

Note: The has_tag and has_tag_value functions use snake_case naming. The older camelCase forms (hasTag, hasTagValue) continue to work but aren't recommended. Databricks plans to deprecate camelCase forms when creating new policies. Existing policies aren't affected.

Example: using two column conditions. A customers schema has tables with an email column tagged pii : email and a consent column tagged consent_to_contact. The policy masks email addresses unless the customer has consented to be contacted. It uses two column conditions:

  1. has_tag_value('pii', 'email') identifies the column that contains email addresses (the column to mask).
  2. has_tag('consent_to_contact') identifies the column that contains consent information (used by the UDF to decide whether to mask).
SQL
CREATE FUNCTION mask_email_by_consent(email STRING, consent BOOLEAN)
RETURNS STRING
RETURN CASE
  WHEN consent = true THEN email
  ELSE '****@****.***'
END;

CREATE POLICY mask_email_with_consent
ON SCHEMA customers
COLUMN MASK mask_email_by_consent
TO `account users`
FOR TABLES
MATCH COLUMNS has_tag_value('pii', 'email') AS m,
              has_tag('consent_to_contact') AS c
ON COLUMN m
USING COLUMNS (c);

This policy only applies to tables that have both a column tagged pii : email and a column tagged consent_to_contact. If a table does not have columns matching both conditions, the policy does not apply and the data is returned unmasked.
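The masking logic can be checked by calling the UDF directly, including the case where consent is unknown. This sketch assumes mask_email_by_consent was created in the current catalog and schema. The email address is a made-up sample.

```sql
-- Direct calls to mask_email_by_consent, outside any policy.
SELECT
  mask_email_by_consent('ana@example.com', true)  AS consented,      -- 'ana@example.com'
  mask_email_by_consent('ana@example.com', false) AS not_consented,  -- '****@****.***'
  -- NULL = true is not true, so the CASE falls through to ELSE.
  mask_email_by_consent('ana@example.com', CAST(NULL AS BOOLEAN)) AS unknown_consent;  -- '****@****.***'
```

A NULL consent value results in a masked email, so missing consent data fails closed.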

User-defined functions (UDFs)

Row filter and column mask policies use user-defined functions (UDFs) to implement their filtering or masking logic. See User-defined functions (UDFs) in Unity Catalog for how to create and manage UDFs, and Common patterns for row filtering and column masking for examples.

Separation of duties and permissions

Setting up ABAC involves several steps, each with its own permission requirements. Organizations can distribute these tasks across specialized groups depending on how they choose to separate duties. For example, an organization can define a tag taxonomy centrally, then have data stewards classify data, governance admins write policies, data creators create objects within governed scopes, and data consumers query the results.

Separation of duties for ABAC

  1. Create the tag taxonomy. Define the governed tag keys and their allowed values before anyone applies them or writes policies. For example, create a sensitivity tag with controlled values (public, internal, confidential, restricted) or a pii tag with values like ssn, email, and phone_number. See Standardize attributes and naming for recommendations on naming conventions and taxonomy design.

    • Required permissions: Account admin, or a user with CREATE permission for tags at the account level.
  2. Tag data assets. A data steward, data creator, or AI classification system applies governed tags to catalogs, schemas, tables, or columns. For example, tag columns that contain personally identifiable information with pii : ssn. Correct tagging is the essential first step for ABAC policies to apply.

    • Required permissions: ASSIGN on the tag, and APPLY TAG on the object.
    Warning: Tagging is a security boundary. If a user can change tags on a data asset, they can change which policies apply to it. Organizations should control who can apply tags and audit tag changes.

  3. Create a policy. A governance admin creates a policy at a scope, such as a catalog or schema. The policy specifies who it applies to, what conditions it evaluates, and the UDF that implements the filtering or masking logic.

    • Required permissions: MANAGE permission or object ownership on the securable where the policy is attached (catalog, schema, or table), and EXECUTE privilege on the UDF.
  4. Create data objects. Data creators create tables within the scopes to which they were granted access. New tables inherit tags from parent objects such as catalogs and schemas. Data creators also have APPLY TAG automatically on objects they create, so they can apply additional tags. Alternatively, they can rely on automatic data classification to handle tagging. If an organization relies on data creators to tag their own objects, it should establish clear tagging practices. Data creators do not need to configure any access controls if policies are set at higher levels, which Databricks recommends.

    • Required permissions: CREATE TABLE or other relevant creation privileges on the parent object.
  5. Query data. When a data consumer queries a table within the policy's scope, the policy evaluates automatically. If the table or columns match the policy's conditions and the user is not exempt, the consumer sees filtered or masked data.

    • Required permissions: Users must be granted permissions on the table, such as SELECT, through a direct object grant. ABAC row filter and column masking policies do not grant permissions on their own. They only filter records or mask columns for tables that a user can already access.
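The object-level permissions listed above could be granted with statements along these lines. This is a sketch under assumptions: every group name, the hr_catalog.payroll.employees table, and the hr_catalog.governance.mask_ssn function location are hypothetical. ASSIGN on a governed tag is granted on the tag itself and is not shown.

```sql
-- Tag data assets: let data stewards apply tags to a table.
GRANT APPLY TAG ON TABLE hr_catalog.payroll.employees TO `data stewards`;

-- Create a policy: let governance admins attach policies at the catalog
-- and reference the masking UDF.
GRANT MANAGE ON CATALOG hr_catalog TO `governance admins`;
GRANT EXECUTE ON FUNCTION hr_catalog.governance.mask_ssn TO `governance admins`;

-- Create data objects: let data creators create tables in a schema.
GRANT USE CATALOG ON CATALOG hr_catalog TO `data creators`;
GRANT USE SCHEMA, CREATE TABLE ON SCHEMA hr_catalog.payroll TO `data creators`;

-- Query data: consumers still need SELECT. Policies only filter or mask.
GRANT USE CATALOG ON CATALOG hr_catalog TO `analysts`;
GRANT USE SCHEMA ON SCHEMA hr_catalog.payroll TO `analysts`;
GRANT SELECT ON TABLE hr_catalog.payroll.employees TO `analysts`;
```

Keeping these grants with separate groups is what allows tagging, policy authoring, data creation, and consumption to be handled by different teams.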

Benefits of ABAC

  • Reusable policies based on attributes: A single policy can apply to multiple data objects that match the same attribute-based conditions, rather than being tied to one specific object.

  • Automatic application to new objects: When new data objects are created within scope and tagged with the relevant attributes, existing ABAC policies apply without additional configuration. Policies act like future grants, which means that access controls apply automatically as new data is created and tagged appropriately.

  • Consistent enforcement within a scope: Policies attached at the catalog or schema level are evaluated dynamically against matching data objects in that scope, which removes differences in how similar data is filtered or masked.

  • Lower ongoing maintenance: Changes can be made by updating policy logic or governed tags, rather than revisiting each individual object as is required with table-level row filters and column masks.

  • Centralized governance: Because policies can be defined once and applied across many matching data objects, governance teams can manage controls across larger parts of the data estate with fewer policy definitions.

More information