This article shows how to create a metastore in Unity Catalog and link it to workspaces. A metastore is the top-level container of objects in Unity Catalog. It stores metadata about data assets (tables and views) and the permissions that govern access to them. You must create one metastore for each region in which your organization operates.
In addition to the approaches described in this article, you can also create a metastore by using the Databricks Terraform provider, specifically the databricks_metastore resource. To enable Unity Catalog to access the metastore, use databricks_metastore_data_access. To link workspaces to a metastore, use databricks_metastore_assignment.
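As a rough illustration of how the three Terraform resources named above relate, the following Python sketch writes a configuration in Terraform's JSON syntax (a .tf.json file). Every value in it (metastore name, bucket path, region, role ARN, workspace ID) is a placeholder, and the argument names are assumptions based on the Databricks Terraform provider documentation, so verify them against the provider version you use. Provider and authentication configuration are deliberately omitted.

```python
import json

# Placeholder values for illustration only; none of these come from this article.
METASTORE_NAME = "primary"
STORAGE_ROOT = "s3://example-metastore-bucket/metastore"
REGION = "us-west-2"
ROLE_ARN = "arn:aws:iam::123456789012:role/example-unity-catalog-role"
WORKSPACE_IDS = [1234567890123456]


def build_terraform_config(name, storage_root, region, role_arn, workspace_ids):
    """Return a dict that serializes to a Terraform JSON configuration."""
    # Terraform resolves this interpolation to the ID of the created metastore.
    metastore_id = "${databricks_metastore.this.id}"
    return {
        "resource": {
            # The metastore itself.
            "databricks_metastore": {
                "this": {
                    "name": name,
                    "storage_root": storage_root,
                    "region": region,
                }
            },
            # The IAM role that Unity Catalog uses to access the bucket.
            "databricks_metastore_data_access": {
                "this": {
                    "metastore_id": metastore_id,
                    "name": f"{name}-data-access",
                    "aws_iam_role": {"role_arn": role_arn},
                    "is_default": True,
                }
            },
            # One assignment per workspace linked to the metastore.
            "databricks_metastore_assignment": {
                f"workspace_{ws_id}": {
                    "metastore_id": metastore_id,
                    "workspace_id": ws_id,
                }
                for ws_id in workspace_ids
            },
        }
    }


config = build_terraform_config(
    METASTORE_NAME, STORAGE_ROOT, REGION, ROLE_ARN, WORKSPACE_IDS
)
print(json.dumps(config, indent=2))
```

Saving the printed output as, for example, metastore.tf.json next to your provider configuration would let Terraform plan these resources. Writing the same resources directly in HCL is equally valid.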
Before you create a metastore, confirm that you meet the following requirements:
You must be a Databricks account admin.
Your Databricks account must be on the Premium plan or above.
In AWS, you must have the ability to create S3 buckets, IAM roles, IAM policies, and cross-account trust relationships.
To create a Unity Catalog metastore:
Configure a storage bucket and IAM role in AWS.
This bucket will store all of the metastore’s managed tables, except those that are in a catalog or schema with their own managed storage location.
When you create the bucket:
Create it in the same region as the workspaces you want to use to access the data.
Use a dedicated S3 bucket for each metastore that you create.
Do not allow direct user access to the bucket.
For instructions, see Configure a storage bucket and IAM role in AWS.
Log in to the Databricks account console.
Click Create Metastore.
Enter a name for the metastore.
Enter the region where the metastore will be deployed.
This must be the same region as the workspaces you want to use to access the data. Make sure that this matches the region of the cloud storage bucket you created earlier.
Enter the S3 bucket path (you can omit s3://) and IAM role name that you created in step 1.
When prompted, select workspaces to link to the metastore.
For more information about linking workspaces to metastores, see Enable a workspace for Unity Catalog.
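The region and bucket-path rules in the steps above can be checked before you open the account console. The sketch below is a minimal, hypothetical pre-flight helper, not part of any Databricks tooling: the function names, the example bucket path, and the workspace names are all assumptions made for illustration.

```python
from typing import Dict, List


def normalize_bucket_path(path: str) -> str:
    """Strip the optional s3:// prefix and any trailing slashes."""
    path = path.strip()
    if path.lower().startswith("s3://"):
        path = path[len("s3://"):]
    return path.rstrip("/")


def check_regions(
    metastore_region: str,
    bucket_region: str,
    workspace_regions: Dict[str, str],
) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # The bucket must be in the same region as the metastore.
    if bucket_region != metastore_region:
        problems.append(
            f"bucket region {bucket_region} does not match "
            f"metastore region {metastore_region}"
        )
    # Every workspace to be linked must be in the metastore's region.
    for name, region in sorted(workspace_regions.items()):
        if region != metastore_region:
            problems.append(
                f"workspace {name} is in {region}, not {metastore_region}"
            )
    return problems


# Example with placeholder values.
print(normalize_bucket_path("s3://example-metastore-bucket/metastore/"))
print(check_regions("us-west-2", "us-west-2", {"analytics": "us-west-2"}))
```

Returning a list of problems rather than raising on the first mismatch lets you see every misconfigured workspace in one pass.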
The user who creates a metastore is its original metastore admin. Databricks recommends that you reassign the original metastore admin to a group. See (Recommended) Transfer ownership of your metastore to a group.
Databricks uses cross-origin resource sharing (CORS) to upload data to personal staging locations in Unity Catalog. See Configure Unity Catalog storage account for CORS.
If you are closing your Databricks account or have another reason to delete access to data managed by your Unity Catalog metastore, you can delete the metastore.
When you delete a metastore, all objects that it manages become inaccessible from Databricks workspaces. This action cannot be undone.
Managed table data and metadata will be auto-deleted after 30 days. External table data in your cloud storage is not affected by metastore deletion.
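If you need to record when managed data stops being recoverable, the 30-day window can be turned into a calendar date. This is a minimal sketch that assumes the window is counted in whole days from the deletion date; the function name is hypothetical, and the exact time at which Databricks performs the deletion is not specified here.

```python
from datetime import date, timedelta

# Retention window stated in this article for managed table data and metadata.
RETENTION_DAYS = 30


def earliest_purge_date(deleted_on: date) -> date:
    """Return the first date on which managed data may have been deleted."""
    return deleted_on + timedelta(days=RETENTION_DAYS)


# Example: a metastore deleted on 15 January 2024.
print(earliest_purge_date(date(2024, 1, 15)))  # prints 2024-02-14
```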
To delete a metastore:
As a metastore admin, log in to the account console.
Click the metastore name.
On the Configuration tab, click the three-button menu at the far upper right and select Delete.
On the confirmation dialog, enter the name of the metastore and click Delete.