Create views
Preview
Unity Catalog is in Public Preview. To participate in the preview, contact your Databricks representative.
This article shows how to create views in Unity Catalog. A view is a read-only object that joins records from multiple tables and views.
Requirements
In Databricks, you must be an account admin.
Your Databricks account must be on the Premium plan.
If necessary, create a metastore.
The metastore must be linked to the workspace where you run the commands to create the catalog or schema.
You can create a catalog or schema from a Data Science & Engineering notebook or a Databricks SQL endpoint that uses a compute resource that is compliant with Unity Catalog’s security requirements.
If necessary, create a catalog and schema in the metastore. The catalog and schema will contain the new table. To create a schema, you must have the
USAGE
andCREATE
data permissions on the schema’s parent catalog. See Create and manage catalogs and Create and manage schemas (databases).If necessary, create tables that will compose the view.
You can create a view from a Data Science & Engineering notebook or a Databricks SQL endpoint that uses a compute resource that is compliant with Unity Catalog’s security requirements.
The parent catalog and schema must exist. When you create a metastore, it contains a catalog called
main
with an empty schema calleddefault
. See Create and manage catalogs and Create and manage schemas (databases).You must have the
USAGE
permission on the parent catalog and theUSAGE
andCREATE
permissions on the parent schema.You must have
SELECT
access on all tables and views referenced in the view.Note
To read from a view from a cluster with Single User cluster security mode, you must have
SELECT
on all referenced tables and views.
Create a view
To create a view, run the following SQL command. Items in brackets are optional. Replace the placeholder values:
<catalog_name>
: The name of the catalog.<schema_name>
: The name of the schema.<view_name>
: A name for the view.<query>
: The query, columns, and tables and views used to compose the view.
CREATE VIEW <catalog_name>.<schema_name>.<view_name> AS
SELECT <query>;
spark.sql("CREATE VIEW <catalog_name>.<schema_name>.<view_name> AS "
"SELECT <query>")
library(SparkR)
sql(paste("CREATE VIEW <catalog_name>.<schema_name>.<view_name> AS ",
"SELECT <query>",
sep = ""))
spark.sql("CREATE VIEW <catalog_name>.<schema_name>.<view_name> AS " +
"SELECT <query>")
For example, to create a view named sales_redacted
from columns in the sales_raw
table:
CREATE VIEW sales_metastore.sales.sales_redacted AS
SELECT
user_id,
email,
country,
product,
total
FROM sales_metastore.sales.sales_raw;
spark.sql("CREATE VIEW sales_metastore.sales.sales_redacted AS "
"SELECT "
" user_id, "
" email, "
" country, "
" product, "
" total "
"FROM sales_metastore.sales.sales_raw")
library(SparkR)
sql(paste("CREATE VIEW sales_metastore.sales.sales_redacted AS ",
"SELECT ",
" user_id, ",
" email, ",
" country, ",
" product, ",
" total ",
"FROM sales_metastore.sales.sales_raw",
sep = ""))
spark.sql("CREATE VIEW sales_metastore.sales.sales_redacted AS " +
"SELECT " +
" user_id, " +
" email, " +
" country, " +
" product, " +
" total " +
"FROM sales_metastore.sales.sales_raw")
You can also create a view by using the Databricks Terraform provider and databricks_table. You can retrieve a list of view full names by using databricks_views.
Create a dynamic view
In Unity Catalog, you can use dynamic views to configure fine-grained access control, including:
Security at the level of columns or rows.
Data masking.
Note
Fine-grained access control using dynamic views are not available on clusters with Single User security mode.
Unity Catalog introduces the following functions, which allow you to dynamically limit which users can access a row, column, or record in a view:
current_user()
: Returns the current user’s email address.is_account_group_member()
: ReturnsTRUE
if the current user is a member of a specific account-level group. Recommended for use in dynamic views against Unity Catalog data.is_member()
: ReturnsTRUE
if the current user is a member of a specific workspace-level group. This function is provided for compatibility with the existing Hive metastore. Avoid using it with views against Unity Catalog data, because it does not evaluate account-level group membership.
The following examples illustrate how to create dynamic views in Unity Catalog.
Column-level permissions
With a dynamic view, you can limit the columns a specific user or group can access. In the following example, only members of the auditors
group can access email addresses from the sales_raw
table. During query analysis, Apache Spark replaces the CASE
statement with either the literal string REDACTED
or the actual contents of the email address column. Other columns are returned as normal. This strategy has no negative impact on the query performance.
-- Alias the field 'email' to itself (as 'email') to prevent the
-- permission logic from showing up directly in the column name results.
CREATE VIEW sales_redacted AS
SELECT
user_id,
CASE WHEN
is_account_group_member('auditors') THEN email
ELSE 'REDACTED'
END AS email,
country,
product,
total
FROM sales_raw
# Alias the field 'email' to itself (as 'email') to prevent the
# permission logic from showing up directly in the column name results.
spark.sql("CREATE VIEW sales_redacted AS "
"SELECT "
" user_id, "
" CASE WHEN "
" is_account_group_member('auditors') THEN email "
" ELSE 'REDACTED' "
" END AS email, "
" country, "
" product, "
" total "
"FROM sales_raw")
library(SparkR)
# Alias the field 'email' to itself (as 'email') to prevent the
# permission logic from showing up directly in the column name results.
sql(paste("CREATE VIEW sales_redacted AS ",
"SELECT ",
" user_id, ",
" CASE WHEN ",
" is_account_group_member('auditors') THEN email ",
" ELSE 'REDACTED' ",
" END AS email, ",
" country, ",
" product, ",
" total ",
"FROM sales_raw",
sep = ""))
// Alias the field 'email' to itself (as 'email') to prevent the
// permission logic from showing up directly in the column name results.
spark.sql("CREATE VIEW sales_redacted AS " +
"SELECT " +
" user_id, " +
" CASE WHEN " +
" is_account_group_member('auditors') THEN email " +
" ELSE 'REDACTED' " +
" END AS email, " +
" country, " +
" product, " +
" total " +
"FROM sales_raw")
Row-level permissions
With a dynamic view, you can specify permissions down to the row or field level. In the following example, only members of the managers
group can view transaction amounts when they exceed $1,000,000. Matching results are filtered out for other users.
CREATE VIEW sales_redacted AS
SELECT
user_id,
country,
product,
total
FROM sales_raw
WHERE
CASE
WHEN is_account_group_member('managers') THEN TRUE
ELSE total <= 1000000
END;
spark.sql("CREATE VIEW sales_redacted AS "
"SELECT "
" user_id, "
" country, "
" product, "
" total "
"FROM sales_raw "
"WHERE "
"CASE "
" WHEN is_account_group_member('managers') THEN TRUE "
" ELSE total <= 1000000 "
"END")
library(SparkR)
sql(paste("CREATE VIEW sales_redacted AS ",
"SELECT ",
" user_id, ",
" country, ",
" product, ",
" total ",
"FROM sales_raw ",
"WHERE ",
"CASE ",
" WHEN is_account_group_member('managers') THEN TRUE ",
" ELSE total <= 1000000 ",
"END",
sep = ""))
spark.sql("CREATE VIEW sales_redacted AS " +
"SELECT " +
" user_id, " +
" country, " +
" product, " +
" total " +
"FROM sales_raw " +
"WHERE " +
"CASE " +
" WHEN is_account_group_member('managers') THEN TRUE " +
" ELSE total <= 1000000 " +
"END")
Data masking
Because views in Unity Catalog use Spark SQL, you can implement advanced data masking by using more complex SQL expressions and regular expressions. In the following example, all users can analyze email domains, but only members of the auditors
group can view a user’s entire email address.
-- The regexp_extract function takes an email address such as
-- user.x.lastname@example.com and extracts 'example', allowing
-- analysts to query the domain name.
CREATE VIEW sales_redacted AS
SELECT
user_id,
region,
CASE
WHEN is_account_group_member('auditors') THEN email
ELSE regexp_extract(email, '^.*@(.*)$', 1)
END
FROM sales_raw
# The regexp_extract function takes an email address such as
# user.x.lastname@example.com and extracts 'example', allowing
# analysts to query the domain name.
spark.sql("CREATE VIEW sales_redacted AS "
"SELECT "
" user_id, "
" region, "
" CASE "
" WHEN is_account_group_member('auditors') THEN email "
" ELSE regexp_extract(email, '^.*@(.*)$', 1) "
" END "
" FROM sales_raw")
library(SparkR)
# The regexp_extract function takes an email address such as
# user.x.lastname@example.com and extracts 'example', allowing
# analysts to query the domain name.
sql(paste("CREATE VIEW sales_redacted AS ",
"SELECT ",
" user_id, ",
" region, ",
" CASE ",
" WHEN is_account_group_member('auditors') THEN email ",
" ELSE regexp_extract(email, '^.*@(.*)$', 1) ",
" END ",
" FROM sales_raw",
sep = ""))
// The regexp_extract function takes an email address such as
// user.x.lastname@example.com and extracts 'example', allowing
// analysts to query the domain name.
spark.sql("CREATE VIEW sales_redacted AS " +
"SELECT " +
" user_id, " +
" region, " +
" CASE " +
" WHEN is_account_group_member('auditors') THEN email " +
" ELSE regexp_extract(email, '^.*@(.*)$', 1) " +
" END " +
" FROM sales_raw")