Manually apply row filters and column masks

This page provides guidance and examples for using row filters, column masks, and mapping tables to filter sensitive data in your tables. These features require Unity Catalog.

If you're looking for a centralized tag-based approach to filtering and masking, see Unity Catalog attribute-based access control (ABAC). ABAC enables you to manage policies using governed tags and apply them consistently across many tables.

Before you begin

To add row filters and column masks to tables, you must meet the following requirements:

  • To assign a function that adds row filters or column masks to a table, you must have the EXECUTE privilege on the function, USE SCHEMA on the schema, and USE CATALOG on the parent catalog.
  • If you are adding filters or masks when you create a new table, you must have the CREATE TABLE privilege on the schema.
  • If you are adding filters or masks to an existing table, you must be the table owner or have the MANAGE privilege on the table.

To access a table that has row filters or column masks, you must use one of the following compute resources:

  • A SQL warehouse.
  • Compute with standard access mode (formerly shared access mode) on Databricks Runtime 12.2 LTS or above.
  • Compute with dedicated access mode (formerly single user access mode) on Databricks Runtime 15.4 LTS or above.

You cannot read tables that have row filters or column masks using dedicated compute on Databricks Runtime 15.3 or below.

To take advantage of the data filtering provided in Databricks Runtime 15.4 LTS and above, you must also verify that your workspace is enabled for serverless compute, because the data filtering that supports row filters and column masks runs on serverless compute. As a result, you might be charged for serverless compute resources when you use compute configured with dedicated access mode to read tables that use row filters or column masks. Write operations to these tables are supported only on Databricks Runtime 16.3 and above and must use supported patterns such as MERGE INTO. See Fine-grained access control on dedicated compute.
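For example, a minimal sketch of such a write, assuming a filtered table named sales (like the one created in the examples below) and a hypothetical staging table sales_updates with the same columns:

SQL
-- Sketch only: upsert into a filtered table with MERGE INTO, one of the
-- supported write patterns on dedicated compute (DBR 16.3+).
-- sales_updates is a hypothetical staging table with columns (region, id).
MERGE INTO sales AS t
USING sales_updates AS s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET t.region = s.region
WHEN NOT MATCHED THEN INSERT (region, id) VALUES (s.region, s.id);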

note

Row filters and column masks are retained when replacing a table.

If you run REPLACE TABLE, any existing row filter is retained regardless of schema changes. Column masks are also retained if the new table includes columns with the same names as those that had masks in the original table. In both cases, the policies are preserved even if they are not explicitly redefined. This prevents accidental loss of data access policies.

However, if a retained policy references a column that was removed or changed, subsequent queries might fail. To resolve this, update or drop the policy using ALTER TABLE.
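For example, a brief sketch using the sales table and us_filter function from the row filter examples below: replacing the table keeps the filter, and ALTER TABLE can then drop it if it no longer fits the new schema.

SQL
-- The replaced table keeps the row filter that was set on the original table.
CREATE OR REPLACE TABLE sales (region STRING, id INT, amount INT);

-- If the retained policy no longer matches the new schema, drop (or reset) it.
ALTER TABLE sales DROP ROW FILTER;
-- For a retained column mask: ALTER TABLE sales ALTER COLUMN <column> DROP MASK;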

Apply a row filter

To create a row filter, you write a function (UDF) to define the filter policy and then apply it to a table. Each table can have only one row filter. A row filter accepts zero or more input parameters where each input parameter binds to one column of the corresponding table.

You can apply a row filter using Catalog Explorer or SQL commands. The Catalog Explorer instructions assume that you have already created a function and registered it in Unity Catalog. The SQL instructions include examples of creating a row filter function and applying it to a table.

note

If you're using Lakeflow Declarative Pipelines, you can use the Lakeflow Declarative Pipelines Python API to create streaming tables or materialized views that use row filters and column masks. See Publish tables with row filters and column masks.

To apply a row filter using Catalog Explorer:

  1. In your Databricks workspace, click Catalog.
  2. Browse or search for the table you want to filter.
  3. On the Overview tab, under Row filter, click Add filter.
  4. On the Add row filter dialog, select the catalog and schema that contain the filter function, then select the function.
  5. On the expanded dialog, view the function definition and select the table columns that match the columns included in the function statement.
  6. Click Add.

To remove the filter from the table, click fx Row filter and then click Remove.

Row filter examples

This example creates a SQL user-defined function that restricts a table to rows in the US region, except for members of the admin group.

When this sample function is applied to the sales table, members of the admin group can access all records in the table. If the function is invoked by a user who is not a member of the admin group, the IS_ACCOUNT_GROUP_MEMBER('admin') condition evaluates to false, so the region='US' expression is evaluated, filtering the table to show only records in the US region.

SQL
CREATE FUNCTION us_filter(region STRING)
RETURN IF(IS_ACCOUNT_GROUP_MEMBER('admin'), true, region='US');

Apply the function to a table as a row filter. Subsequent queries from the sales table then return a subset of rows.

SQL
CREATE TABLE sales (region STRING, id INT);
ALTER TABLE sales SET ROW FILTER us_filter ON (region);

Disable the row filter. Future user queries from the sales table then return all of the rows in the table.

SQL
ALTER TABLE sales DROP ROW FILTER;

Create a table with the function applied as a row filter as part of the CREATE TABLE statement. Future queries from the sales table then each return a subset of rows.

SQL
CREATE TABLE sales (region STRING, id INT)
WITH ROW FILTER us_filter ON (region);
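The same filter function can be reused as the row filter for more than one table. A brief sketch, assuming a hypothetical sales_eu table that also has a region column:

SQL
-- Reuse the existing us_filter function on another table.
CREATE TABLE sales_eu (region STRING, id INT)
WITH ROW FILTER us_filter ON (region);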

Apply a column mask

To apply a column mask, create a function (UDF) and apply it to a table column.

You can apply a column mask using Catalog Explorer or SQL commands. The Catalog Explorer instructions assume that you have already created a function and registered it in Unity Catalog. The SQL instructions include examples of creating a column mask function and applying it to a table column.

note

If you're using Lakeflow Declarative Pipelines, you can use the Lakeflow Declarative Pipelines Python API to create streaming tables or materialized views that use row filters and column masks. See Publish tables with row filters and column masks.

To apply a column mask using Catalog Explorer:

  1. In your Databricks workspace, click Catalog.
  2. Browse or search for the table.
  3. On the Overview tab, find the row for the column you want to mask and click the Mask edit icon.
  4. On the Add column mask dialog, select the catalog and schema that contain the masking function, then select the function.
  5. On the expanded dialog, view the function definition. If the function includes any parameters in addition to the column being masked, select the table columns that supply values for those additional function parameters.
  6. Click Add.

To remove the column mask from the table, click fx Column mask in the table row and click Remove.

Column mask examples

In this example, you create a user-defined function that masks the ssn column so that only users who are members of the HumanResourceDept group can view values in that column.

SQL
CREATE FUNCTION ssn_mask(ssn STRING)
RETURN CASE WHEN is_account_group_member('HumanResourceDept') THEN ssn ELSE '***-**-****' END;

Apply the new function to a table as a column mask. You can add the column mask when you create the table or later.

SQL
-- Create the `users` table and apply the column mask in a single step:
CREATE TABLE users (
  name STRING,
  ssn STRING MASK ssn_mask);

SQL
-- Create the `users` table and apply the column mask after:
CREATE TABLE users
  (name STRING, ssn STRING);

ALTER TABLE users ALTER COLUMN ssn SET MASK ssn_mask;

Queries on that table now return masked ssn column values when the querying user is not a member of the HumanResourceDept group:

SQL
SELECT * FROM users;

-- Returns:
-- James  ***-**-****

To disable the column mask so that queries return the original values in the ssn column:

SQL
ALTER TABLE users ALTER COLUMN ssn DROP MASK;
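The Catalog Explorer steps above also mention masking functions that take columns in addition to the one being masked. A hedged sketch of that pattern follows; it assumes the USING COLUMNS clause for binding extra columns, plus a hypothetical users_regional table and ssn_mask_region function:

SQL
-- Sketch: a mask function that also reads a second column from the same row.
CREATE FUNCTION ssn_mask_region(ssn STRING, region STRING)
RETURN CASE
  WHEN is_account_group_member('HumanResourceDept') OR region = 'US' THEN ssn
  ELSE '***-**-****'
END;

-- Hypothetical table with an extra region column; USING COLUMNS binds it
-- to the mask function's second parameter (syntax assumed here).
CREATE TABLE users_regional (name STRING, region STRING, ssn STRING);

ALTER TABLE users_regional ALTER COLUMN ssn
  SET MASK ssn_mask_region USING COLUMNS (region);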

Use mapping tables to create an access-control list

To achieve row-level security, consider defining a mapping table (or access-control list). A comprehensive mapping table encodes which data rows in the original table are accessible to certain users or groups. Mapping tables are useful because they offer simple integration with your fact tables through direct joins.

This approach addresses many use cases with custom requirements. Examples include:

  • Imposing restrictions based on the logged-in user while accommodating different rules for specific user groups.
  • Creating intricate hierarchies, such as organizational structures, that require diverse sets of rules.
  • Replicating complex security models from external source systems.

By adopting mapping tables, you can accomplish these challenging scenarios and ensure robust row-level and column-level security implementations.
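As a minimal sketch of the direct-join pattern (the region_map table, region_filter function, and sales_facts table here are illustrative), a mapping table can record which regions each user may see, and the filter checks the row's region against that mapping:

SQL
-- Illustrative mapping table: the regions each user is allowed to see.
CREATE TABLE region_map(username STRING, region STRING);
INSERT INTO region_map VALUES
  ('fred@databricks.com', 'US'),
  ('barney@databricks.com', 'EU');

-- The parameter is named target_region so it is not shadowed by the
-- mapping table's own region column inside the subquery.
CREATE FUNCTION region_filter(target_region STRING)
RETURN EXISTS(
  SELECT 1 FROM region_map m
  WHERE m.username = CURRENT_USER() AND m.region = target_region
);

-- sales_facts is a hypothetical fact table with a region column.
ALTER TABLE sales_facts SET ROW FILTER region_filter ON (region);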

Mapping table examples

Use a mapping table to check whether the current user is in a list. First, select the catalog to work in:

SQL
USE CATALOG main;

Create a new mapping table:

SQL
DROP TABLE IF EXISTS valid_users;

CREATE TABLE valid_users(username string);
INSERT INTO valid_users
VALUES
('fred@databricks.com'),
('barney@databricks.com');

Create a new filter:

note

All filters run with definer's rights, except for functions that check user context (for example, the CURRENT_USER and IS_ACCOUNT_GROUP_MEMBER functions), which run as the invoker.

In this example, the function checks whether the current user is in the valid_users table. If the user is found, the function returns true.

SQL
DROP FUNCTION IF EXISTS row_filter;

CREATE FUNCTION row_filter()
RETURN EXISTS(
SELECT 1 FROM valid_users v
WHERE v.username = CURRENT_USER()
);

The example below applies the row filter during table creation. You can also add the filter later using an ALTER TABLE statement. If the filter function takes no column arguments, use the ON () syntax. To bind the filter to a specific column, use ON (column). For more details, see Parameters.

SQL
DROP TABLE IF EXISTS data_table;

CREATE TABLE data_table
(x INT, y INT, z INT)
WITH ROW FILTER row_filter ON ();

INSERT INTO data_table VALUES
(1, 2, 3),
(4, 5, 6),
(7, 8, 9);

Select data from the table. This should only return data if the user is in the valid_users table.

SQL
SELECT * FROM data_table;

Create a mapping table of account groups whose members should always be able to view all of the rows in the table, regardless of the column values:

SQL
CREATE TABLE valid_accounts(account string);
INSERT INTO valid_accounts
VALUES
('admin'),
('cstaff');

Now, create a SQL UDF that returns true if the values of all columns in the row are less than five, or if the invoking user is a member of a group listed in the mapping table above.

SQL
CREATE FUNCTION row_filter_small_values (x INT, y INT, z INT)
RETURN (x < 5 AND y < 5 AND z < 5)
OR EXISTS(
SELECT 1 FROM valid_accounts v
WHERE IS_ACCOUNT_GROUP_MEMBER(v.account));

Finally, apply the SQL UDF to the table as a row filter:

SQL
ALTER TABLE data_table SET ROW FILTER row_filter_small_values ON (x, y, z);
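As a quick check of the result (a sketch based on the rows inserted above): because a table can have only one row filter, this replaces the earlier row_filter, and a user who is not a member of the admin or cstaff groups now sees only rows whose values are all below five.

SQL
SELECT * FROM data_table;
-- For a user outside the admin and cstaff groups, only (1, 2, 3) is returned;
-- the other rows contain at least one value that is not less than 5.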