HIPAA compliance features

Preview

The ability for admins to add Enhanced Security and Compliance features is in Public Preview. The compliance security profile and support for compliance standards are generally available (GA).

HIPAA compliance features require enabling the compliance security profile, which adds monitoring agents, enforces instance types for inter-node encryption, provides a hardened compute image, and adds other features. For technical details, see Compliance security profile. It is your responsibility to confirm that each workspace has the compliance security profile enabled.

To use the compliance security profile, your Databricks account must include the Enhanced Security and Compliance add-on. For details, see the pricing page.

This feature requires your workspace to be on the Enterprise pricing tier.

Ensure that sensitive information is never entered in customer-defined input fields, such as workspace names, cluster names, and job names.
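
As a minimal sketch of how names could be screened before they are submitted, the following check flags strings that look like common identifiers. The pattern set and the function name `find_sensitive_patterns` are assumptions made for illustration; they are not part of Databricks, and pattern matching of this kind cannot catch every form of sensitive information.

```python
import re

# Hypothetical patterns for illustration only. A real deployment would
# tune these to its own data and naming conventions.
SENSITIVE_PATTERNS = {
    "ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email_like": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}


def find_sensitive_patterns(name: str) -> list[str]:
    """Return the labels of all patterns that match a proposed
    workspace, cluster, or job name."""
    return [
        label
        for label, pattern in SENSITIVE_PATTERNS.items()
        if pattern.search(name)
    ]
```

A caller could reject or rename any proposed name for which the returned list is not empty.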

Which compute resources get enhanced security

The compliance security profile enhancements apply to compute resources in the classic compute plane in all regions.

Serverless SQL warehouse support for the compliance security profile varies by region. See Serverless SQL warehouses support the compliance security profile in some regions.

HIPAA overview

The Health Insurance Portability and Accountability Act of 1996 (HIPAA), the Health Information Technology for Economic and Clinical Health (HITECH), and the regulations issued under HIPAA are a set of US healthcare laws. Among other provisions, these laws establish requirements for the use, disclosure, and safeguarding of protected health information (PHI).

HIPAA applies to covered entities and business associates that create, receive, maintain, transmit, or access PHI. When a covered entity or business associate engages the services of a cloud service provider (CSP), such as Databricks, the CSP becomes a business associate under HIPAA.

HIPAA regulations require that covered entities and their business associates enter into a contract called a Business Associate Agreement (BAA) to ensure the business associates will protect PHI adequately. Among other things, a BAA establishes the permitted and required uses and disclosures of PHI by the business associate, based on the relationship between the parties and the activities and services being performed by the business associate.

Does Databricks permit the processing of PHI data on Databricks?

Yes, if you enable the compliance security profile and add the HIPAA compliance standard as part of the compliance security profile configuration. Contact your Databricks account team for more information. Before you process PHI data, it is your responsibility to have a BAA with Databricks.

Enable HIPAA on a workspace

This section assumes you are on the E2 version of the Databricks platform.

If you are an existing HIPAA customer and your account is not yet on the E2 version of the Databricks platform, note the following:

  • The E2 platform is a multi-tenant platform, and your choice to deploy HIPAA on E2 will be treated as a waiver of any provision in your contract that would be in conflict with our ability to provide you HIPAA on the E2 platform.

To configure your workspace to support processing of data regulated by the HIPAA compliance standard, the workspace must have the compliance security profile enabled. You can enable it and add the HIPAA compliance standard across all workspaces or only on some workspaces.
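
Because the profile can be enabled on only some workspaces, it can help to keep an inventory and check it before PHI is processed. The following is a minimal sketch under stated assumptions: `WorkspaceRecord` and its fields are a hypothetical inventory format, not the shape of any Databricks API response, and the records would have to be populated from your own account configuration.

```python
from dataclasses import dataclass, field


@dataclass
class WorkspaceRecord:
    # Hypothetical inventory record; the field names are assumptions.
    name: str
    profile_enabled: bool
    compliance_standards: list[str] = field(default_factory=list)


def workspaces_not_ready_for_phi(records: list[WorkspaceRecord]) -> list[str]:
    """Return the names of workspaces that lack either the compliance
    security profile or the HIPAA compliance standard."""
    return [
        record.name
        for record in records
        if not (record.profile_enabled and "HIPAA" in record.compliance_standards)
    ]
```

An empty result means every workspace in the inventory has both settings recorded. It does not replace confirming the settings in the workspace itself.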

Important

  • You are wholly responsible for ensuring your own compliance with all applicable laws and regulations. Information provided in Databricks online documentation does not constitute legal advice, and you should consult your legal advisor for any questions regarding regulatory compliance.

  • Databricks does not support the use of preview features for the processing of PHI on the HIPAA on E2 platform, with the exception of the features listed in Preview features that are supported for processing of PHI data.

Shared responsibility of HIPAA compliance

Responsibility for HIPAA compliance is shared among three parties: AWS, Databricks, and you. While each party has numerous responsibilities, the key responsibilities of each are listed below.

This article uses the Databricks terms control plane and compute plane, which are the two main parts of how Databricks works:

  • The Databricks control plane includes the backend services that Databricks manages in its own AWS account.

  • The compute plane is where your data lake is processed. The classic compute plane includes an AWS VPC in your AWS account, and clusters of compute resources to process your notebooks, jobs, and pro or classic SQL warehouses.

    Important

    For workspaces with HIPAA compliance features enabled, compute plane refers to the classic compute plane in your own AWS account. As of this release, serverless compute features are disabled on a workspace with HIPAA compliance features enabled.

Key responsibilities of AWS include:

  • Perform its obligations as a business associate under your BAA with AWS.

  • Provide you the EC2 machines under your contract with AWS that support HIPAA compliance.

  • Provide hardware-accelerated encryption at rest and in-transit encryption within the AWS Nitro Instances that is adequate under HIPAA.

  • Delete encryption keys and data when Databricks releases the EC2 instances.

Key responsibilities of Databricks include:

  • Encrypt in-transit PHI data that is transmitted to or from the control plane.

  • Encrypt PHI data at rest in the control plane.

  • Limit the set of instance types to the AWS Nitro instance types that enforce in-transit encryption and encryption at rest. For the list of supported instance types, see AWS Nitro System and HIPAA compliance features. Databricks limits the instance types both in the account console and through the API.

  • Deprovision EC2 instances when you indicate in Databricks that they are to be deprovisioned, for example through auto-termination or manual termination, so that AWS can wipe them.
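
Since Databricks rejects instance types outside the supported set, a pre-flight check in your own tooling can surface a bad cluster definition earlier. The sketch below is illustrative only: the function names are hypothetical, and `EXAMPLE_ALLOWED_FAMILIES` is a placeholder, not the documented list. Take the real list from AWS Nitro System and HIPAA compliance features.

```python
def instance_family(instance_type: str) -> str:
    """Return the family part of an EC2 instance type,
    for example 'm5n' for 'm5n.xlarge'."""
    return instance_type.split(".", 1)[0].lower()


def unsupported_instance_types(requested, allowed_families):
    """Return the requested instance types whose family is not allowed."""
    allowed = {family.lower() for family in allowed_families}
    return [t for t in requested if instance_family(t) not in allowed]


# Placeholder values for illustration, NOT the documented supported list.
EXAMPLE_ALLOWED_FAMILIES = {"m5n", "r5n"}
```

For example, `unsupported_instance_types(["m5n.xlarge", "t2.micro"], EXAMPLE_ALLOWED_FAMILIES)` reports only the `t2.micro` entry.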

Your key responsibilities include:

  • Configure your workspace to use either customer-managed keys for managed services or the Store interactive notebook results in customer account feature.

  • Do not use preview features within Databricks to process PHI other than features listed in Preview features that are supported for processing of PHI data.

  • Follow security best practices, such as disabling unnecessary egress from the compute plane and using the Databricks secrets feature (or other similar functionality) to store access keys that provide access to PHI.

  • Enter into a business associate agreement with AWS to cover all data processed within the VPC where the EC2 instances are deployed.

  • Do not do anything within a virtual machine that would violate HIPAA. For example, do not direct Databricks to send unencrypted PHI to an endpoint.

  • Ensure that all data that may contain PHI is encrypted at rest when you store it in locations that the Databricks platform may interact with. This includes setting the encryption settings on each workspace’s root S3 bucket that is part of workspace creation. You are responsible for ensuring the encryption (as well as performing backups) for this storage and all other data sources.

  • Ensure that all data that may contain PHI is encrypted in transit between Databricks and any of your data storage locations or external locations you access from a compute plane machine. For example, any APIs that you use in a notebook that might connect to an external data source must use appropriate encryption on any outgoing connections.
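
For the in-transit requirement on outgoing connections, one simple guard in notebook code is to refuse any endpoint that does not use HTTPS. This is a minimal sketch, and the helper names are hypothetical. It checks only the URL scheme, so it does not validate certificates or TLS versions, and it does not cover non-HTTP protocols such as JDBC.

```python
from urllib.parse import urlparse


def is_encrypted_endpoint(url: str) -> bool:
    """Return True only for URLs with an https scheme and a host."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and bool(parsed.netloc)


def require_encrypted(url: str) -> str:
    """Return the URL unchanged, or raise ValueError if it is not HTTPS."""
    if not is_encrypted_endpoint(url):
        raise ValueError(f"Refusing non-HTTPS endpoint: {url}")
    return url
```

Wrapping every outgoing URL in `require_encrypted` before a request is made turns an accidental plain-HTTP call into an immediate error.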

Note the following about customer-managed keys:

  • You can add customer-managed keys for your workspace’s root S3 bucket using the customer-managed keys for workspace storage feature, but Databricks does not require you to do so.

  • As an optional part of the customer-managed keys for workspace storage feature, you can add customer-managed keys for EBS volumes, but this is not necessary for HIPAA compliance.
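
To spot-check the encryption setting on a root S3 bucket, you can inspect the bucket's default encryption configuration. The sketch below assumes the response dictionary has the shape returned by the boto3 S3 client's `get_bucket_encryption` call. It does not make the call itself, and the function name is hypothetical.

```python
from typing import Optional


def default_sse_algorithm(response: dict) -> Optional[str]:
    """Return the default server-side encryption algorithm from a
    get_bucket_encryption-style response, or None if no rule sets one."""
    config = response.get("ServerSideEncryptionConfiguration", {})
    for rule in config.get("Rules", []):
        default = rule.get("ApplyServerSideEncryptionByDefault", {})
        algorithm = default.get("SSEAlgorithm")
        if algorithm:
            return algorithm
    return None


# Assumed usage; the bucket name is a placeholder:
# import boto3
# response = boto3.client("s3").get_bucket_encryption(Bucket="my-root-bucket")
# print(default_sse_algorithm(response))
```

A result of `aws:kms` indicates that a KMS key is the default. Whether that key is customer-managed has to be confirmed separately.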

Legacy HIPAA support for cluster creation

If you are an existing HIPAA customer and your workspace is not on the E2 version of the Databricks platform, to create a cluster, see the legacy article Create and verify a cluster for legacy HIPAA support.