What is Databricks Clean Rooms?
Preview
This feature is in Public Preview. To request access, reach out to your Databricks representative.
This article introduces Clean Rooms, a Databricks feature that uses Delta Sharing and serverless compute to provide a secure and privacy-protecting environment where multiple parties can work together on sensitive enterprise data without direct access to each other’s data.
Requirements
To be eligible to use clean rooms, you must:
Sign up and be approved for the public preview. Contact your Databricks account team to request access.
Have an account that is enabled for serverless compute. See Enable serverless compute.
Have a workspace that is enabled for Unity Catalog. See Enable a workspace for Unity Catalog.
How does Clean Rooms work?
When you create a clean room, you create the following:
A securable clean room object in your Unity Catalog metastore.
The “central” clean room, which is an isolated ephemeral environment managed by Databricks.
A securable clean room object in your collaborator’s Unity Catalog metastore.
Tables, volumes (non-tabular data), and notebooks that either collaborator adds to the clean room are shared, using Delta Sharing, only with the central clean room.
Collaborators cannot see the data in other collaborators’ tables and volumes, but they can see column names and column types, and they can run approved notebook code that operates over the tables and volumes. The notebook code runs in the central clean room.
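The typical pattern is that an approved notebook computes aggregate results over both collaborators' shared data rather than returning row-level data. The following sketch assumes a Databricks notebook environment (where `spark` and `display` are predefined) and uses hypothetical catalog and table names; the actual names depend on what each collaborator shares into the clean room.

```python
# A minimal sketch of an approved clean room notebook (PySpark).
# The catalog and table names below are hypothetical; the tables you
# can read depend on what each collaborator shared into the clean room.
from pyspark.sql import functions as F

customers = spark.table("creator.default.customers")             # shared by one collaborator
transactions = spark.table("collaborator.default.transactions")  # shared by the other

# Return only aggregate results, never row-level data.
overlap = (
    customers.join(transactions, on="customer_id", how="inner")
    .groupBy("merchant_category")
    .agg(F.countDistinct("customer_id").alias("overlapping_customers"))
)

display(overlap)
```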
How does Clean Rooms ensure a no-trust environment?
The Databricks Clean Rooms model is “no-trust.” All collaborators in a no-trust clean room have equal privileges, including the creator of the clean room. Clean Rooms is designed to prevent the running of unauthorized code and the unauthorized sharing of data. For example, all collaborators must approve a notebook before it can be run. Approval is enforced implicitly: a collaborator cannot run a notebook that they created themselves and can only run notebooks created by the other collaborator.
Additional safeguards and restrictions
The following safeguards are in place in addition to the implicit notebook approval process mentioned above:
After a clean room is created, it is locked down to prevent new collaborators from joining the clean room.
If any collaborator deletes the clean room, the central clean room is invalidated and no clean room tasks can be run by any user.
During the public preview, each clean room is limited to two collaborators.
You cannot rename the clean room.
The clean room name must be unique in every collaborator’s metastore, so that all collaborators can refer to the same clean room unambiguously.
Comments on the clean room securable in each collaborator’s workspace are not propagated to other collaborators.
Limitations
During the public preview, the following limitations apply:
No support for disabling internet access in clean rooms to prevent malicious code from exfiltrating data to an external location.
No service credential Scala libraries included in the required Databricks Runtime version.
Resource quotas
Databricks enforces resource quotas on all clean room securable objects. These quotas are listed in Resource limits. If you expect to exceed these resource limits, contact your Databricks account team.
You can monitor your quota usage using the Unity Catalog resource quotas APIs. See Monitor your usage of Unity Catalog resource quotas.
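As a rough illustration, you can query quota usage programmatically. The sketch below assumes the list endpoint described in the Unity Catalog resource quotas API documentation and authentication with a personal access token; verify the exact path, authentication method, and response fields against the linked documentation for your workspace.

```python
# A minimal sketch of listing Unity Catalog resource quota usage via the
# REST API. The endpoint path and response shape are assumptions based on
# the Unity Catalog resource quotas API docs; confirm them before use.
import os
import requests

host = os.environ["DATABRICKS_HOST"]    # e.g. https://<workspace-url>
token = os.environ["DATABRICKS_TOKEN"]  # personal access token

resp = requests.get(
    f"{host}/api/2.1/unity-catalog/resource-quotas/all-resource-quotas",
    headers={"Authorization": f"Bearer {token}"},
)
resp.raise_for_status()

# Print each quota entry as returned; field names vary by API version,
# so inspect the raw response rather than assuming a schema.
for quota in resp.json().get("quotas", []):
    print(quota)
```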