Tutorial: Unity Catalog metastore admin tasks for Databricks SQL
This tutorial guides Databricks metastore admins through all the steps necessary to configure catalogs, databases, and tables with Unity Catalog. You use the Catalog Explorer UI to complete all steps.
As a Unity Catalog metastore admin, you can configure access to all securable objects associated with a metastore in Unity Catalog. Databricks account administrators can assign the metastore admin role in the account console. While many metastore admins may also be Databricks account administrators or Databricks workspace administrators, the default permissions for these roles vary.
This article demonstrates the following:
Requirements
In order to successfully complete this demo, you will need:
A Databricks workspace with Unity Catalog enabled
Admin privileges on a metastore defined in the Databricks account console
Access to Databricks SQL in a workspace associated with the metastore
One or more users or groups defined in Unity Catalog
Metastore admin privileges enable you to work with all data objects in the metastore. By working through this tutorial with another Databricks SQL user without metastore admin privileges, you can view how changing grants immediately updates access to various actions for different assets.
Step 2: Create a new catalog in the metastore
You can use the Catalog Explorer UI to add catalogs. To use SQL, see CREATE CATALOG.
The right side of the Catalog Explorer contains links to all assets in the hierarchy of your currently viewed data object in the pattern:
Catalogs > catalog_name > database_name >
. Click Catalogs to return to the top level of the metastore hierarchy.Click Create catalog in the top right of the screen.
Enter a unique catalog name.
Enter an optional comment.
Click Create.
Newly created catalogs contain two schemas (databases): default
and information_schema
. For more information on this system table, see Information schema.
Step 3: Create a new schema in the catalog
You can use the Catalog Explorer UI to add databases. To use SQL, see CREATE SCHEMA.
Click the name of the catalog from the list on the left side of the screen.
Click Create database in the top right of the screen.
Enter a unique database name.
Enter an optional comment.
Click Create.
Step 4: Create a table from a CSV
Databricks provides a UI that allows you to upload a CSV file into a Unity Catalog table. You can use any CSV under 100 megabytes. Your SQL warehouse must be active in order to use the UI. For more details, see Load data using the add data UI.
If you prefer to create a table using SQL, see CREATE TABLE and skip this step.
Click Add > Add data, and then click Create or modify a table.
Add a CSV file from your computer by browsing or dropping the file in the interactive box.
It may take a few seconds for the file to upload.
Use the first dropdown menu to select the catalog you created in step 2.
Use the second dropdown menu to select the database you created in step 3.
You can provide a custom name for the table.
If the table has a header row, make sure the First row contains the header box is checked. Otherwise, uncheck this box.
Optionally, provide custom names for each column.
If the data from your file is not previewing correctly, click Advanced attributes and adjust the available settings.
Click Create.
Step 5: Grant permissions for catalog, database, and table
The minimum set of permissions required for Databricks SQL end users to complete their tasks varies based on the desired outcome. You can assign permissions to either individual users or groups. By working through this tutorial with another Databricks SQL user, you can view how changing grants immediately updates access to various actions for different assets.
Step 5a: Grant USE CATALOG
permissions for catalog
Users who want to perform actions in a catalog must have USE CATALOG
permissions.
Click
Catalog in the sidebar to return to the Catalog Explorer.
Click the name of the catalog you created in step 2.
Click the Permissions tab to see the grants for the catalog.
Click Grant.
Use the dropdown to select the user or group you wish to add.
Click the box next to the
USE CATALOG
privilege.Click Grant.
Step 5b: Grant USE SCHEMA
permissions for database
USE SCHEMA
is the minimum privilege that users need in order to view a database. This allows users to interact with tables or views contained within the database. Users must have CREATE TABLE
privileges to define new tables or views from queries or UIs.
Click the name of the database you created in step 3.
Click the Permissions tab to see the grants for the database.
Click Grant.
Use the dropdown to select the user or group you wish to add.
Click the box next to the
USE SCHEMA
privilege.Click Grant.
Step 5c: Grant SELECT
permissions for table
Users can only see tables and views they have permissions on. SELECT
is the minimum privilege required to see and query a table or view. Users with MODIFY
privileges can add, delete, and modify data, and can also drop tables entirely.
Click the name of the table you created in step 4.
Click the Permissions tab to see the grants for the table.
Click Grant.
Use the dropdown to select the user or group you wish to add.
Click the box next to the
SELECT
privilege.Click Grant.
Next steps
Secure access to production data should follow a principle of least privilege. In this tutorial, you configured a catalog, database, and table for an end user to openly query production data, but denied them the ability to accidentally corrupt or delete that data. Most end users should have this configuration for most data they have access to, as most queries are read-only queries.
Users may also need access to save derivative datasets to a database. Databricks recommends creating a separate database for derivative datasets to isolate these tables from production tables. To test this, create a new database and grant the CREATE TABLE
and CREATE VIEW
privileges to the same user. The user can now save the results of a query on the table you created in step 4 to a newly created table or view.
If groups of users need to work on tables together, grant MODIFY
permissions on specific tables.
Databricks recommends managing permission grants to groups rather than individual users.