Create views

Preview

Unity Catalog is in Public Preview. To participate in the preview, contact your Databricks representative.

This article shows how to create views in Unity Catalog. A view is a read-only object composed from one or more tables and views in a metastore.

Requirements

  • To create a metastore, you must be a Databricks account admin.

  • Your Databricks account must be on the Premium plan.

  • If necessary, create a metastore.

  • The metastore must be linked to the workspace where you run the commands to create the catalog or schema.

  • You can create a catalog or schema from a Data Science & Engineering notebook or a Databricks SQL endpoint that uses a compute resource that is compliant with Unity Catalog’s security requirements.

  • If necessary, create a catalog and schema in the metastore. The catalog and schema will contain the new view. To create a schema, you must have the USAGE and CREATE data permissions on the schema’s parent catalog. See Create and manage catalogs and Create and manage schemas (databases).

  • If necessary, create tables that will compose the view.

  • You can create a view from a Data Science & Engineering notebook or a Databricks SQL endpoint that uses a compute resource that is compliant with Unity Catalog’s security requirements.

  • The parent catalog and schema must exist. When you create a metastore, it contains a catalog called main with an empty schema called default. See Create and manage catalogs and Create and manage schemas (databases).

  • You must have the USAGE permission on the parent catalog and the USAGE and CREATE permissions on the parent schema.

  • You must have SELECT access on all tables and views referenced in the view.

    Note

    To read from a view from a cluster with Single User cluster security mode, you must have SELECT on all referenced tables and views.

Create a view

To create a view, run the following SQL command. Replace the placeholder values:

  • <catalog_name>: The name of the catalog.

  • <schema_name>: The name of the schema.

  • <view_name>: A name for the view.

  • <query>: The query, columns, and tables and views used to compose the view.

SQL

CREATE VIEW <catalog_name>.<schema_name>.<view_name> AS
SELECT <query>;

Python

spark.sql("CREATE VIEW <catalog_name>.<schema_name>.<view_name> AS "
  "SELECT <query>")

R

library(SparkR)

sql(paste("CREATE VIEW <catalog_name>.<schema_name>.<view_name> AS ",
  "SELECT <query>",
  sep = ""))

Scala

spark.sql("CREATE VIEW <catalog_name>.<schema_name>.<view_name> AS " +
  "SELECT <query>")

For example, to create a view named sales_redacted from columns in the sales_raw table:

SQL

CREATE VIEW sales_metastore.sales.sales_redacted AS
SELECT
  user_id,
  email,
  country,
  product,
  total
FROM sales_metastore.sales.sales_raw;

Python

spark.sql("CREATE VIEW sales_metastore.sales.sales_redacted AS "
  "SELECT "
  "  user_id, "
  "  email, "
  "  country, "
  "  product, "
  "  total "
  "FROM sales_metastore.sales.sales_raw")

R

library(SparkR)

sql(paste("CREATE VIEW sales_metastore.sales.sales_redacted AS ",
  "SELECT ",
  "  user_id, ",
  "  email, ",
  "  country, ",
  "  product, ",
  "  total ",
  "FROM sales_metastore.sales.sales_raw",
  sep = ""))

Scala

spark.sql("CREATE VIEW sales_metastore.sales.sales_redacted AS " +
  "SELECT " +
  "  user_id, " +
  "  email, " +
  "  country, " +
  "  product, " +
  "  total " +
  "FROM sales_metastore.sales.sales_raw")

You can also create a view by using the Databricks Terraform provider and databricks_table. You can retrieve a list of view full names by using databricks_views.

Create a dynamic view

In Unity Catalog, you can use dynamic views to configure fine-grained access control, including:

  • Security at the level of columns or rows.

  • Data masking.

Note

Fine-grained access control using dynamic views is not available on clusters with Single User security mode.

Unity Catalog introduces the following functions, which allow you to dynamically limit which rows and columns a user can access in a view:

  • current_user(): Returns the current user’s email address.

  • is_account_group_member(): Returns TRUE if the current user is a member of a specific account-level group. Recommended for use in dynamic views against Unity Catalog data.

  • is_member(): Returns TRUE if the current user is a member of a specific workspace-level group. This function is provided for compatibility with the existing Hive metastore. Avoid using it with views against Unity Catalog data, because it does not evaluate account-level group membership.

The following examples illustrate how to create dynamic views in Unity Catalog.

Column-level permissions

With a dynamic view, you can limit the columns a specific user or group can access. In the following example, only members of the auditors group can access email addresses from the sales_raw table. During query analysis, Apache Spark replaces the CASE expression with either the literal string REDACTED or the actual contents of the email column. Other columns are returned as normal. This strategy has no negative impact on query performance.

SQL

-- Alias the field 'email' to itself (as 'email') to prevent the
-- permission logic from showing up directly in the column name results.
CREATE VIEW sales_redacted AS
SELECT
  user_id,
  CASE WHEN
    is_account_group_member('auditors') THEN email
    ELSE 'REDACTED'
  END AS email,
  country,
  product,
  total
FROM sales_raw

Python

# Alias the field 'email' to itself (as 'email') to prevent the
# permission logic from showing up directly in the column name results.
spark.sql("CREATE VIEW sales_redacted AS "
  "SELECT "
  "  user_id, "
  "  CASE WHEN "
  "    is_account_group_member('auditors') THEN email "
  "  ELSE 'REDACTED' "
  "  END AS email, "
  "  country, "
  "  product, "
  "  total "
  "FROM sales_raw")

R

library(SparkR)

# Alias the field 'email' to itself (as 'email') to prevent the
# permission logic from showing up directly in the column name results.
sql(paste("CREATE VIEW sales_redacted AS ",
  "SELECT ",
  "  user_id, ",
  "  CASE WHEN ",
  "    is_account_group_member('auditors') THEN email ",
  "  ELSE 'REDACTED' ",
  "  END AS email, ",
  "  country, ",
  "  product, ",
  "  total ",
  "FROM sales_raw",
  sep = ""))

Scala

// Alias the field 'email' to itself (as 'email') to prevent the
// permission logic from showing up directly in the column name results.
spark.sql("CREATE VIEW sales_redacted AS " +
  "SELECT " +
  "  user_id, " +
  "  CASE WHEN " +
  "    is_account_group_member('auditors') THEN email " +
  "  ELSE 'REDACTED' " +
  "  END AS email, " +
  "  country, " +
  "  product, " +
  "  total " +
  "FROM sales_raw")
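The column-masking CASE logic above can be sketched in plain Python. This is a conceptual illustration only, not the Spark SQL functions themselves: in Unity Catalog, is_account_group_member() is evaluated at query time, while here a simple set lookup stands in for it, and the group membership and user names are hypothetical.

```python
# Hypothetical stand-in for the account-level 'auditors' group.
AUDITORS = {"alice@example.com"}

def mask_email(email: str, current_user: str) -> str:
    # Mirrors: CASE WHEN is_account_group_member('auditors')
    #          THEN email ELSE 'REDACTED' END
    return email if current_user in AUDITORS else "REDACTED"

print(mask_email("user@example.com", "alice@example.com"))  # auditors see the address
print(mask_email("user@example.com", "bob@example.com"))    # others see 'REDACTED'
```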

Row-level permissions

With a dynamic view, you can specify permissions down to the row or field level. In the following example, only members of the managers group can view transaction amounts when they exceed $1,000,000. Matching results are filtered out for other users.

SQL

CREATE VIEW sales_redacted AS
SELECT
  user_id,
  country,
  product,
  total
FROM sales_raw
WHERE
  CASE
    WHEN is_account_group_member('managers') THEN TRUE
    ELSE total <= 1000000
  END;

Python

spark.sql("CREATE VIEW sales_redacted AS "
  "SELECT "
  "  user_id, "
  "  country, "
  "  product, "
  "  total "
  "FROM sales_raw "
  "WHERE "
  "CASE "
  "  WHEN is_account_group_member('managers') THEN TRUE "
  "  ELSE total <= 1000000 "
  "END")

R

library(SparkR)

sql(paste("CREATE VIEW sales_redacted AS ",
  "SELECT ",
  "  user_id, ",
  "  country, ",
  "  product, ",
  "  total ",
  "FROM sales_raw ",
  "WHERE ",
  "CASE ",
  "  WHEN is_account_group_member('managers') THEN TRUE ",
  "  ELSE total <= 1000000 ",
  "END",
  sep = ""))

Scala

spark.sql("CREATE VIEW sales_redacted AS " +
  "SELECT " +
  "  user_id, " +
  "  country, " +
  "  product, " +
  "  total " +
  "FROM sales_raw " +
  "WHERE " +
  "CASE " +
  "  WHEN is_account_group_member('managers') THEN TRUE " +
  "  ELSE total <= 1000000 " +
  "END")
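The row-level WHERE predicate above can be sketched in plain Python: managers see every row, while all other users see only rows with total at or below 1,000,000. The is_manager flag here is a hypothetical stand-in for is_account_group_member('managers'), and the sample rows are illustrative.

```python
# Illustrative sample rows (hypothetical data).
rows = [
    {"user_id": 1, "total": 500_000},
    {"user_id": 2, "total": 2_000_000},
]

def visible_rows(rows, is_manager: bool):
    # Mirrors: WHERE CASE WHEN is_account_group_member('managers')
    #          THEN TRUE ELSE total <= 1000000 END
    return [r for r in rows if is_manager or r["total"] <= 1_000_000]

print([r["user_id"] for r in visible_rows(rows, is_manager=True)])   # managers: all rows
print([r["user_id"] for r in visible_rows(rows, is_manager=False)])  # others: small rows only
```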

Data masking

Because views in Unity Catalog use Spark SQL, you can implement advanced data masking by using more complex SQL expressions and regular expressions. In the following example, all users can analyze email domains, but only members of the auditors group can view a user’s entire email address.

SQL

-- The regexp_extract function takes an email address such as
-- user.x.lastname@example.com and extracts 'example.com', allowing
-- analysts to query the domain name.

CREATE VIEW sales_redacted AS
SELECT
  user_id,
  region,
  CASE
    WHEN is_account_group_member('auditors') THEN email
    ELSE regexp_extract(email, '^.*@(.*)$', 1)
  END
FROM sales_raw

Python

# The regexp_extract function takes an email address such as
# user.x.lastname@example.com and extracts 'example.com', allowing
# analysts to query the domain name.

spark.sql("CREATE VIEW sales_redacted AS "
  "SELECT "
  "  user_id, "
  "  region, "
  "  CASE "
  "    WHEN is_account_group_member('auditors') THEN email "
  "    ELSE regexp_extract(email, '^.*@(.*)$', 1) "
  "  END "
  "FROM sales_raw")

R

library(SparkR)

# The regexp_extract function takes an email address such as
# user.x.lastname@example.com and extracts 'example.com', allowing
# analysts to query the domain name.

sql(paste("CREATE VIEW sales_redacted AS ",
  "SELECT ",
  "  user_id, ",
  "  region, ",
  "  CASE ",
  "    WHEN is_account_group_member('auditors') THEN email ",
  "    ELSE regexp_extract(email, '^.*@(.*)$', 1) ",
  "  END ",
  "FROM sales_raw",
  sep = ""))

Scala

// The regexp_extract function takes an email address such as
// user.x.lastname@example.com and extracts 'example.com', allowing
// analysts to query the domain name.

spark.sql("CREATE VIEW sales_redacted AS " +
  "SELECT " +
  "  user_id, " +
  "  region, " +
  "  CASE " +
  "    WHEN is_account_group_member('auditors') THEN email " +
  "    ELSE regexp_extract(email, '^.*@(.*)$', 1) " +
  "  END " +
  "FROM sales_raw")
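The pattern '^.*@(.*)$' used by regexp_extract captures everything after the last '@' in the address. Because Python's re module accepts the same pattern, you can verify the masking behavior locally as a quick sanity check (a sketch, not part of the view definition):

```python
import re

def extract_domain(email: str) -> str:
    # Same pattern as regexp_extract(email, '^.*@(.*)$', 1):
    # greedy '.*' consumes up to the last '@', group 1 captures the rest.
    match = re.match(r"^.*@(.*)$", email)
    return match.group(1) if match else ""

print(extract_domain("user.x.lastname@example.com"))  # example.com
```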