Create clean rooms

Preview

This feature is in Public Preview.

This article describes how to create a clean room, a secure and privacy-protecting environment where multiple parties can work together on sensitive enterprise data without direct access to each other’s data.

Before you begin

The privileges needed to use clean rooms vary depending on the task:

  • To create a clean room, you must have the CREATE CLEAN ROOM privilege or be a metastore admin. The creator is automatically assigned as the owner of the clean room in their Unity Catalog metastore.

  • To initiate participation in a clean room that is shared with you, you must be a metastore admin.

    When a clean room is shared, the collaborator organization’s metastore admin is automatically assigned ownership of the clean room. The metastore admin can reassign ownership to a non-metastore admin. As a best practice for data governance, Databricks recommends that ownership be assigned to a group.

    If your workspace does not have a metastore admin assigned, you must assign the role. See Assign a metastore admin and Manage Unity Catalog object ownership.

  • To add and remove data assets and notebooks in a clean room, you must be the owner of the clean room or have the MODIFY CLEAN ROOM privilege on the clean room. Additionally, you and the owner of the clean room (if you are not the owner) must have SELECT on tables and views that you add and READ VOLUME on volumes that you add.

To learn about permission requirements for updating clean rooms and running tasks (notebooks) in clean rooms, see Manage clean rooms and Run notebooks in clean rooms.

You can create up to five clean rooms per metastore.

Step 1. Request the collaborator’s sharing identifier

Before you can create a clean room, you must have the Clean Room sharing identifier of the organization that you will be collaborating with. The sharing identifier is a string that consists of the organization’s global metastore ID + workspace ID + the contact’s username (email address). The collaborator can be in any cloud or region.

Reach out to the collaborator to request their sharing identifier.

The collaborator can get the sharing identifier using the instructions in Find your sharing identifier.

Step 2. Create a clean room

To create a clean room, you must use Catalog Explorer.

  1. In your Databricks workspace, click Catalog.

  2. On the Quick access page, click the Clean Rooms > button.

    Alternatively, click the gear icon at the top of the Catalog pane and select Clean Rooms.

  3. Click Create Clean Room.

  4. On the Create Clean Room page, enter a user-friendly name for the clean room.

    The name cannot use spaces, periods, or forward slashes (/).

    You cannot change the clean room name once it’s saved. Use a name that the collaborator will find useful and descriptive.

  5. Select the cloud provider and region where the central clean room will be created.

    The cloud provider must be the same as that of your current workspace, but the region can be different. Consider your organization’s data residency or other policies when you make your selection.

  6. (Optional) Add a comment.

  7. Enter the collaborator’s Clean Room sharing identifier.

    See Step 1. Request the collaborator’s sharing identifier.

    You can test your clean room before full deployment by using either your sharing identifier or the identifier of another user in your current metastore. Doing so creates two clean rooms in your current metastore. For example, if you create a clean room titled test_clean_room, a second clean room named test_clean_room_collaborator also appears. Running notebooks with a collaborator in the same metastore functions the same as with an external collaborator. See Run notebooks in clean rooms.

  8. Make note of the catalog names assigned to you (the creator) and the collaborator.

    All data assets added to the clean room will appear under that catalog in the central clean room, and can be referenced using that catalog in the Unity Catalog three-level namespace (<catalog>.<schema>.<table-etc>).

  9. Select the network access policy type. This cannot be changed after the clean room is created.

    Note

    Restricted access can delay asset availability for up to ten minutes and does not support Google Cloud collaborators.

    After you create the clean room, you can view the network access policy in the Security tab.

  10. Click Create Clean Room.
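The naming rule in step 4 and the self-test behavior described in step 7 can be sketched in a few lines of Python. This is only an illustration for checking a candidate name before you type it into Catalog Explorer: the function names are hypothetical, the non-empty check is an assumption, and Databricks may enforce additional rules (such as length limits) that this article does not list.

```python
import re

# Characters the article says a clean room name cannot contain:
# spaces, periods, and forward slashes.
_FORBIDDEN = re.compile(r"[ ./]")


def is_valid_clean_room_name(name: str) -> bool:
    """Return True if the name avoids spaces, periods, and forward slashes.

    Rejecting an empty name is an assumption, not a documented rule.
    """
    return bool(name) and _FORBIDDEN.search(name) is None


def self_test_room_names(name: str) -> tuple:
    """Return the two clean room names that appear when you test a clean
    room against an identifier in your own metastore (hypothetical helper)."""
    if not is_valid_clean_room_name(name):
        raise ValueError(f"invalid clean room name: {name!r}")
    return name, f"{name}_collaborator"
```

For example, `self_test_room_names("test_clean_room")` returns the pair `test_clean_room` and `test_clean_room_collaborator`, matching the example in step 7.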

Step 3. Add data assets and notebooks to the clean room

Either party in the clean room (the creator and the collaborator) can add tables, volumes, views, and notebooks to the clean room.

Permissions required:

  • You must be the owner or have the MODIFY CLEAN ROOM privilege on the clean room.

  • You and the clean room owner (if you are not the owner) must have SELECT on any table or view and READ VOLUME on any volume that you add, along with USE CATALOG and USE SCHEMA on the parent catalog and schema.

    The clean room owner must keep these privileges throughout the life of the clean room.
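The permission requirements above combine several conditions, so it can help to see them as one check. The following Python sketch models them with plain sets. The `Principal` class and both functions are hypothetical illustrations, not a Databricks API, and the model simplifies by treating USE CATALOG and USE SCHEMA as a single set of privileges on the asset's parents.

```python
from dataclasses import dataclass, field

# Privilege required on each asset type, as listed above.
ASSET_PRIVILEGE = {"table": "SELECT", "view": "SELECT", "volume": "READ VOLUME"}
PARENT_PRIVILEGES = {"USE CATALOG", "USE SCHEMA"}


@dataclass
class Principal:
    """Hypothetical stand-in for a user or group and the privileges it holds."""
    name: str
    room_privileges: set = field(default_factory=set)    # on the clean room
    asset_privileges: set = field(default_factory=set)   # on the asset itself
    parent_privileges: set = field(default_factory=set)  # on parent catalog and schema


def has_asset_access(p: Principal, asset_type: str) -> bool:
    """True if the principal can read the asset and use its parent catalog and schema."""
    needed = ASSET_PRIVILEGE[asset_type]
    return needed in p.asset_privileges and PARENT_PRIVILEGES <= p.parent_privileges


def can_add_asset(user: Principal, owner: Principal, asset_type: str) -> bool:
    """True if `user` may add the asset to a clean room owned by `owner`."""
    is_owner = user.name == owner.name
    may_modify = is_owner or "MODIFY CLEAN ROOM" in user.room_privileges
    # Both the user and the clean room owner need access to the asset.
    owner_ok = is_owner or has_asset_access(owner, asset_type)
    return may_modify and has_asset_access(user, asset_type) and owner_ok
```

In this model, a non-owner with MODIFY CLEAN ROOM still cannot add a table if the owner lacks SELECT on it, which mirrors the requirement that the owner keep these privileges for the life of the clean room.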

Note

The following instructions assume you are returning to an already-created clean room to add assets. If you just created a clean room for the first time, a wizard walks you through adding data assets and notebooks. The actual UI for adding these assets is the same, regardless of whether you are guided by the wizard or not.

To add assets:

  1. In your Databricks workspace, click Catalog.

  2. On the Quick access page, click the Clean Rooms > button.

    Alternatively, click the gear icon at the top of the Catalog pane and select Clean Rooms.

  3. Find and click the name of the clean room you want to update.

  4. Click + Add data assets to add tables, volumes, or views.

  5. Select the data assets you want to share and click Add data assets.

    When you share a table, volume, or view, you can optionally add an alias. The alias name will be the only name visible in the clean room.

    When you share a table, you can optionally add partition clauses that enable you to share only part of the table. For details about how to use partitions to limit what you share, see Specify table partitions to share.

  6. To add notebooks, click the + Add notebooks button and browse for the notebook you want to add.

    You can optionally give the notebook an alternative Notebook name.

    Notebooks that you share in clean rooms query data and run data analysis workloads on the tables, views, and volumes that you and the other collaborator have added to the clean room.

    Notebooks operate on the principle of implicit approval: you cannot run notebooks you create. You create the notebooks that your collaborator uses, and your collaborator creates the notebooks that you use.

    If you share a notebook that includes results, those results will be shared with your collaborator.

    You can use a notebook to create output tables that are temporarily shared to your collaborator’s metastore when they run the notebook. See Create and work with output tables in Databricks Clean Rooms.

    Important

    Any notebook references to tables, views, or volumes that were added to the clean room must use the catalog name assigned when the clean room was created (“creator” for data assets added by the clean room creator, and “collaborator” for data assets added by the invited collaborator). For example, a table added by the creator could be named creator.sales.california.

    Likewise, verify that the notebook uses any aliases that were assigned to data assets in the clean room.
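As a quick way to review a notebook before sharing it, the catalog rule above can be expressed as a small check. This is a minimal sketch, assuming the catalog names are exactly “creator” and “collaborator” as described above. The helper names are hypothetical, and the check does not handle backtick-quoted identifiers that contain periods.

```python
# Catalog names that the article says are assigned when the clean room is created.
CLEAN_ROOM_CATALOGS = {"creator", "collaborator"}


def qualified_name(catalog: str, schema: str, asset: str) -> str:
    """Build a three-level name of the form <catalog>.<schema>.<asset>."""
    if catalog not in CLEAN_ROOM_CATALOGS:
        raise ValueError(f"unexpected clean room catalog: {catalog!r}")
    return f"{catalog}.{schema}.{asset}"


def uses_clean_room_catalog(reference: str) -> bool:
    """True if a three-level reference starts with a clean room catalog name."""
    parts = reference.split(".")
    return len(parts) == 3 and all(parts) and parts[0] in CLEAN_ROOM_CATALOGS
```

For example, `qualified_name("creator", "sales", "california")` yields `creator.sales.california`, the name used in the example above. In a notebook, a string like this is what you would pass to a call such as `spark.table(...)`. If an asset was given an alias, use the alias in place of the original schema and asset names.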