HIPAA-Compliant Deployment

Databricks supports HIPAA-compliant deployments for processing protected health information (PHI). In this deployment mode, all PHI data is encrypted both at rest and in transit. Follow the steps in this article to ensure that your deployment is HIPAA compliant.

Sign a Business Associate Agreement (BAA) with AWS

Contact your account manager or email sales@databricks.com to sign a Business Associate Agreement (BAA) with AWS and maintain compliance with HIPAA regulations. This agreement ensures that all data flowing through our services meets HIPAA requirements.

Create and verify a HIPAA-compliant cluster

These steps describe how to create a HIPAA-compliant cluster to process PHI data.

Step 1: Create a cluster

Follow the instructions in Create a Cluster. As part of the configuration step you must choose a Databricks runtime.

Warning

Databricks Runtime for Machine Learning includes high-performance distributed machine learning packages that use MPI (Message Passing Interface) and other low-level communication protocols. Because these protocols do not natively support encryption over the wire, these ML packages can potentially send unencrypted sensitive data across the network. If your workload does not depend on these packages, they do not affect over-the-wire encryption.

What are the risks?

Messages sent across the network by these ML packages typically contain either model parameters or summary statistics about training data, so sensitive data such as protected health information is not normally sent over the wire unencrypted. However, certain configurations or uses of these packages (such as specific model designs) could result in network messages that do contain such information.

Which packages are affected?

Step 2: Configure the cluster with an EBS volume

Provision an EBS volume for the cluster: Databricks EBS volumes are encrypted, while the default instance local storage is not.

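If you create clusters through the Clusters API rather than the UI, the EBS configuration can be expressed in the request body. The field names below follow the `aws_attributes` object of the Databricks Clusters API; the cluster name, node type, worker count, and volume size are illustrative assumptions, and `<runtime-version>` is a placeholder you must replace with your chosen Databricks runtime.

```json
{
  "cluster_name": "hipaa-phi-cluster",
  "spark_version": "<runtime-version>",
  "node_type_id": "m5.xlarge",
  "num_workers": 2,
  "aws_attributes": {
    "ebs_volume_type": "GENERAL_PURPOSE_SSD",
    "ebs_volume_count": 1,
    "ebs_volume_size": 100
  }
}
```

Attaching EBS volumes this way ensures that cluster storage uses encrypted volumes instead of unencrypted instance-local disks.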

Step 3: Verify that encryption is enabled

  1. Create a notebook in the workspace and attach the notebook to the cluster that was created in the previous step.

  2. Run the following command in the notebook:

    %scala
    spark.conf.get("spark.ssl.enabled")


    If the returned value is true, you have successfully created a cluster with encryption enabled. If not, contact help@databricks.com.
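
To make this check fail loudly instead of relying on visual inspection, you can wrap the same lookup in an assertion. This is a sketch for a notebook cell on the cluster under test; `spark` is the session provided by the notebook, and the `"false"` fallback is an assumption for clusters where the key is unset.

```scala
%scala
// Fail the cell if cluster-internal TLS is not enabled.
// "false" is a fallback default for clusters where the key is unset (assumption).
val sslEnabled = spark.conf.get("spark.ssl.enabled", "false").toBoolean
require(sslEnabled, "spark.ssl.enabled is not true: traffic between cluster nodes is not encrypted")
```

Running this as part of cluster setup turns the manual verification step into an automatic guard before any PHI workload starts.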