Create clean rooms

This page describes how to create a Databricks clean room, a secure environment for collaborative data analysis.

Key features and limitations:

  • Secure collaboration: Clean rooms enable multiple parties to work together on sensitive enterprise data without direct access to each other's raw data.
  • Collaborator capacity: A clean room can have up to ten parties, the creator plus up to nine other collaborators.
  • Metastore limit: You can create up to five clean rooms per metastore.
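
If you are approaching the five-clean-room limit, you can check how many clean rooms already exist in your metastore before creating another. The following is a minimal sketch that lists clean rooms through the Clean Rooms REST API; the endpoint path, the response field name, and the DATABRICKS_HOST / DATABRICKS_TOKEN environment variables are assumptions here, so verify them against the Clean Rooms API reference for your workspace.

```python
# Hedged sketch: list existing clean rooms to check the per-metastore count.
# ASSUMPTIONS: the /api/2.0/clean-rooms endpoint and the "clean_rooms" response
# field are assumed from the Clean Rooms REST API; DATABRICKS_HOST and
# DATABRICKS_TOKEN are placeholder environment variables you must set yourself.
import os
import requests

host = os.environ["DATABRICKS_HOST"]    # e.g. https://<your-workspace-url>
token = os.environ["DATABRICKS_TOKEN"]  # a personal access token

resp = requests.get(
    f"{host}/api/2.0/clean-rooms",
    headers={"Authorization": f"Bearer {token}"},
    timeout=30,
)
resp.raise_for_status()

clean_rooms = resp.json().get("clean_rooms", [])
print(f"Clean rooms visible in this metastore: {len(clean_rooms)} (limit: 5)")
for room in clean_rooms:
    print("-", room.get("name"))
```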

Before you begin

The privileges needed to use clean rooms vary depending on the task:

  • View a clean room: Must be the owner of the clean room, or have one of the following privileges on the clean room: MANAGE, MODIFY CLEAN ROOM, EXECUTE CLEAN ROOM TASK, or BROWSE.
  • Update the owner of a clean room: Must be the owner of the clean room, or have the MANAGE privilege on the clean room.
  • Add or remove data assets in a clean room: Must be the owner of the clean room or have the MODIFY CLEAN ROOM privilege on the clean room. If you are not the owner of the clean room, both you and the clean room owner must have SELECT on any table or view and READ VOLUME on any volume that you add, along with USE CATALOG and USE SCHEMA on the parent catalog and schema. An example of granting these privileges follows this list.
  • Add or remove notebooks in a clean room: The notebook's uploader must have the EXECUTE CLEAN ROOM TASK or MODIFY CLEAN ROOM privilege if they are the designated runner of the notebook, and the MODIFY CLEAN ROOM privilege if the notebook is runnable by a collaborator.
  • Update a comment in a clean room: Must be the owner of the clean room or have the MODIFY CLEAN ROOM privilege on the clean room.
  • Grant access to a clean room: Must be the owner, or have the MANAGE privilege on the clean room.
  • Delete a clean room: Must be the owner or have the MANAGE privilege on the clean room.
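
For the "Add or remove data assets" task above, the prerequisite grants (SELECT, READ VOLUME, USE CATALOG, USE SCHEMA) are standard Unity Catalog privileges. The sketch below shows one way an administrator might grant them from a notebook before a contributor adds a table and a volume to a clean room. All catalog, schema, table, volume, and principal names are placeholders, and the clean room owner must hold the same privileges on the assets.

```python
# Hedged sketch: grant the prerequisite Unity Catalog privileges described above.
# All object and principal names are placeholders; replace them with your own.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already available as `spark` in a Databricks notebook

principal = "`collaborating.analyst@example.com`"  # the user who will add the assets
statements = [
    f"GRANT USE CATALOG ON CATALOG sales_catalog TO {principal}",
    f"GRANT USE SCHEMA ON SCHEMA sales_catalog.forecasts TO {principal}",
    f"GRANT SELECT ON TABLE sales_catalog.forecasts.california TO {principal}",
    f"GRANT READ VOLUME ON VOLUME sales_catalog.forecasts.raw_files TO {principal}",
]
for stmt in statements:
    spark.sql(stmt)
```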

Beyond the task-specific privileges, the following is true:

  • When a clean room is shared, the metastore admin of your collaborator's organization is automatically assigned ownership. The metastore admin can reassign ownership to a non-metastore admin. As a best practice for data governance, Databricks recommends that ownership be assigned to a group.
  • If your workspace does not have a metastore admin assigned, you must assign the role. See Assign a metastore admin and Manage Unity Catalog object ownership.
  • The clean room owner must keep these privileges throughout the life of the clean room.

To learn about permission requirements for updating clean rooms and running tasks (notebooks) in clean rooms, see Manage clean rooms and Run notebooks in clean rooms.

note

The central clean room can have a maximum of two other non-AWS regions among its collaborators.

Step 1. Request the collaborator's sharing identifier

Before you can create a clean room, you must have the Clean Room sharing identifier of the organizations you will be collaborating with. The sharing identifier is a string that consists of the organization's global metastore ID + workspace ID + the contact's username (email address). Your collaborators can be in any cloud or region.

Reach out to your collaborators to request their sharing identifier. They can get the sharing identifier using the instructions in Find your sharing identifier.
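
If your collaborators use Databricks, one way they can gather the pieces that make up the sharing identifier is with the Databricks SDK for Python. This is a minimal sketch, assuming the databricks-sdk package is installed and default workspace authentication is configured; it only prints the individual components (metastore ID, workspace ID, and username), because the exact formatting of the identifier string should be taken from Find your sharing identifier.

```python
# Hedged sketch: print the components of a clean room sharing identifier
# (metastore ID + workspace ID + username). Assumes the databricks-sdk package
# and that default authentication (e.g., environment variables) is configured.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

assignment = w.metastores.current()  # current metastore assignment for this workspace
me = w.current_user.me()             # the authenticated user

print("Metastore ID:", assignment.metastore_id)
print("Workspace ID:", assignment.workspace_id)
print("Username:    ", me.user_name)
```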

Step 2. Create a clean room

To create a clean room, you must use Catalog Explorer.

  1. In your Databricks workspace, click Catalog.

  2. On the Quick access page, click the Clean Rooms > button.

  3. Click Create Clean Room.

  4. On the Create Clean Room page, enter a user-friendly name for the clean room.

    The name cannot use spaces, periods, or forward slashes (/).

    Once it's saved, the clean room name cannot be changed. Use a name that potential collaborators will find useful and descriptive.

  5. Select the cloud provider and region where the central clean room will be created.

    The cloud provider must match your current workspace, but the region can be different. Consider your organization's data residency or other policies when you make your selection.

  6. Each clean room can have up to ten collaborators. Enter the Clean Room sharing identifier for each collaborator. See Step 1. Request the collaborator's sharing identifier.

    You can test your clean room before full deployment by using either your sharing identifier or the identifier of another user in your current metastore. Doing so creates two clean rooms in your current metastore. For example, if you create a clean room titled test_clean_room, a second clean room named test_clean_room_collaborator also appears. Running notebooks with a collaborator in the same metastore functions the same as with an external collaborator. See Run notebooks in clean rooms.

  7. Make a note of the catalog names assigned to you and your collaborators.

    All data assets that a party adds to the clean room appear under that party's catalog in the central clean room and can be referenced using that catalog in the Unity Catalog three-level namespace (<catalog>.<schema>.<table-etc>).

  8. Select the network access policy type. This cannot be changed after the clean room is created.

    note

    Restricted access can delay asset availability for up to ten minutes and does not support Google Cloud collaborators.

    After you create the clean room, you can view the network access policy in the Security tab.

  9. Click Create Clean Room.

If your current workspace is set to the HIPAA compliance security profile, then when you create a clean room, that setting is applied to the central clean room. Collaborators must access the clean room from a workspace with the same security profile. See Compliance security profile.

Step 3. Add data assets and notebooks to the clean room

Both the creator and the collaborators can add tables, volumes, views, and notebooks to the clean room.

note

The following instructions assume you are returning to an already-created clean room to add assets. If you just created a clean room for the first time, a wizard walks you through adding data assets and notebooks. The actual UI for adding these assets is the same, regardless of whether you are guided by the wizard or not.

To add notebooks:

  1. Click the + Add notebooks button and browse for the notebook you want to add.

  2. Name the notebook.

  3. Select which collaborator can run the notebook. Select You to run the notebook yourself.

    You can optionally give the notebook an alternative Notebook name.

    Notebooks that you share in clean rooms query data and run data analysis workloads on the tables, views, and volumes that you and any other collaborators have added to the clean room.

    If you share a notebook that includes results, those results will be shared with your collaborators.

    You can use a notebook to create output tables that are temporarily shared to your collaborator's metastore when they run the notebook. See Create and work with output tables in Databricks Clean Rooms.

    To use a test dataset, download our sample notebook.

    important

    Any notebook references to tables, views, or volumes that were added to the clean room must use the catalog name assigned when the clean room was created (“creator” for data assets added by the clean room creator, and “collaborator” for data assets added by the invited collaborator). For example, a table added by the creator could be named creator.sales.california.

    Likewise, verify that the notebook uses any aliases that were assigned to data assets in the clean room.
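
    For example, a notebook cell that reads the creator-added table from the note above might look like the following sketch. It assumes a PySpark notebook running in the clean room and uses the illustrative creator.sales.california name; swap in the catalog, alias, and column names that apply to your clean room.

```python
# Hedged sketch of a clean room notebook cell: reference shared assets through
# the clean room catalog names ("creator" / "collaborator") and any assigned
# aliases, never through the original catalog names. The table and column names
# are the illustrative ones from the note above.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()  # already available as `spark` in a Databricks notebook

# Table added by the clean room creator, referenced via the "creator" catalog.
sales = spark.read.table("creator.sales.california")

# An aggregate that can appear in notebook results without exposing raw rows.
# "city" is a placeholder column name.
summary = sales.groupBy("city").agg(F.count("*").alias("order_count"))
summary.show()
```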

To add assets:

  1. In your Databricks workspace, click Catalog.

  2. On the Quick access page, click the Clean Rooms > button.

  3. Find and click the name of the clean room you want to update.

  4. Click + Add data assets to add tables, volumes, or views.

  5. Select the data assets you want to share and click Add data assets.

    When you share a table, volume, or view, you can optionally add an alias. The alias name will be the only name visible in the clean room.

    When you share a table, you can optionally add partition clauses that enable you to share only part of the table. For details about how to use partitions to limit what you share, see Specify table partitions to share.

    note

    If you're using default storage, you can't share table partitions. See Default storage in serverless workspaces.

note

To participate in the Private Preview for federated table sharing, contact your Databricks account representative. See What is Lakehouse Federation?.