Databricks supports HIPAA-compliant deployment for processing protected health information (PHI). In this deployment mode, all PHI is encrypted at rest and in transit. Follow the steps in this article to ensure that your deployment is HIPAA compliant.
Contact your account manager or email email@example.com, and sign a Business Associate Agreement (BAA) with AWS, to maintain compliance with HIPAA regulations. This agreement ensures that all data flowing through our services is handled in accordance with HIPAA regulations.
The following steps describe how to create a HIPAA-compliant cluster to process PHI.
Databricks Runtime for Machine Learning includes high-performance distributed machine learning packages that use MPI (Message Passing Interface) and other low-level communication protocols. Because these protocols do not natively support encryption over the wire, these ML packages can potentially send unencrypted sensitive data across the network. If your workflow does not depend on these packages, they have no effect on data encryption over the wire.
Messages sent across the network by these ML packages are typically either ML model parameters or summary statistics about training data. Sensitive data, such as protected health information, is therefore not typically expected to be sent over the wire unencrypted. However, certain configurations or uses of these packages (such as specific model designs) could result in messages that contain such information being sent across the network.
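To see how a summary statistic can end up carrying a raw value, consider a minimal sketch in which each worker sends only the mean of its local partition. The function name, the partitions, and the values are hypothetical and do not come from any Databricks package; the sketch only shows that a partition holding a single record produces a "summary" equal to that record.

```python
from statistics import mean

def local_summaries(partitions):
    """Return the per-partition mean that each worker would send over the network."""
    return [mean(p) for p in partitions]

# Hypothetical measurements split across three workers; the last worker holds one record.
partitions = [[5.1, 6.3, 5.8], [7.2, 6.9], [9.4]]
summaries = local_summaries(partitions)

# The last summary is identical to the single underlying record.
leaked = summaries[-1] == partitions[-1][0]
print(leaked)
```

If a workflow can produce very small partitions or per-record statistics, it is worth reviewing whether those messages travel over an unencrypted channel.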
Provision an EBS volume, as Databricks EBS volumes are encrypted while the default local storage is not.
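If you create clusters programmatically rather than through the UI, the EBS requirement might be expressed as part of the cluster specification. The following is a sketch only: it assumes the `aws_attributes` field names (`ebs_volume_type`, `ebs_volume_count`, `ebs_volume_size`) used by the Databricks Clusters API, and the cluster name, runtime version, node type, and sizes are placeholders. Check the API reference for your workspace before relying on it.

```python
import json

def build_cluster_spec(name, ebs_volume_count=1, ebs_volume_size_gb=100):
    """Build a cluster specification that requests EBS volumes for each node."""
    if ebs_volume_count < 1:
        raise ValueError("at least one EBS volume is required")
    return {
        "cluster_name": name,
        "spark_version": "<runtime-version>",  # placeholder, not a real version string
        "node_type_id": "<node-type>",         # placeholder, not a real instance type
        "num_workers": 2,
        "aws_attributes": {
            # Field names assumed from the Clusters API; verify before use.
            "ebs_volume_type": "GENERAL_PURPOSE_SSD",
            "ebs_volume_count": ebs_volume_count,
            "ebs_volume_size": ebs_volume_size_gb,
        },
    }

spec = build_cluster_spec("phi-cluster")  # hypothetical cluster name
print(json.dumps(spec, indent=2))
```

The guard against a zero volume count reflects the point of this step: without an EBS volume, data would fall back to the unencrypted default local storage.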
Create a notebook in the workspace and attach the notebook to the cluster that was created in the previous step.
Run the following command in the notebook:
If the returned value is true, you have successfully created a cluster with encryption turned on. If not, contact firstname.lastname@example.org.
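The check on the returned value can also be scripted. The sketch below assumes that the value being inspected is the Spark property `spark.ssl.enabled` and that it comes back as the string `"true"` or `"false"`; both are assumptions, not confirmed by this article. `get_conf` is a hypothetical stand-in for a lookup such as `spark.conf.get` in a notebook.

```python
def encryption_enabled(get_conf, key="spark.ssl.enabled"):
    """Return True only if the configuration lookup yields a true value.

    The default key is an assumption; substitute the property your
    deployment actually reports.
    """
    value = get_conf(key)
    return str(value).strip().lower() == "true"

# Stand-in for a notebook session; in a notebook, pass spark.conf.get instead.
fake_conf = {"spark.ssl.enabled": "true"}
print(encryption_enabled(fake_conf.get))
```

Normalizing the value to a lowercase string lets the same helper accept either a boolean or a string result, and a missing value is treated as encryption being off.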