Compliance security profile

This article describes the compliance security profile and its compliance controls.

Compliance security profile overview

The compliance security profile enables additional monitoring, enforced instance types for inter-node encryption, a hardened compute image, and other features and controls on Databricks workspaces. Enabling the compliance security profile is required to use Databricks to process data that is regulated under supported compliance standards, such as HIPAA.

You can also choose to enable the compliance security profile for its enhanced security features without the need to conform to a compliance standard.

Important

  • You are solely responsible for ensuring your own compliance with all applicable laws and regulations.

  • You are solely responsible for ensuring that the compliance security profile and the appropriate compliance standards are configured before processing regulated data.

  • If you add HIPAA, it is your responsibility to have a business associate agreement (BAA) with Databricks before you process PHI data.

Which compute resources get enhanced security

The compliance security profile enhancements apply to compute resources in the classic compute plane in all regions.

Serverless SQL warehouse support for the compliance security profile varies by region. See Serverless SQL warehouses support the compliance security profile in some regions.

Compliance security profile features and technical controls

Security enhancements include:

  • An enhanced hardened operating system image based on Ubuntu Advantage.

    Ubuntu Advantage is a package of enterprise security and support for open source infrastructure and applications.

  • Automatic cluster update is automatically enabled.

    Clusters are periodically restarted to get the latest updates during a maintenance window that you can configure. See Automatic cluster update.

  • Enhanced security monitoring is automatically enabled.

    Security monitoring agents generate logs that you can review. For more information on the monitoring agents, see Monitoring agents in Databricks compute plane images.

  • Enforced use of AWS Nitro instance types in clusters and Databricks SQL warehouses.

  • Egress communication uses TLS 1.2 or higher, including connections to the metastore.
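
If you want to spot-check that a given endpoint negotiates TLS 1.2, one option is a quick probe from a notebook. The following is a minimal sketch, assuming openssl is available on the cluster image; <endpoint-host> is a placeholder for whatever endpoint you want to verify.

  %sh
  # Force a TLS 1.2 handshake against a placeholder endpoint; a successful
  # handshake confirms that the endpoint accepts TLS 1.2.
  openssl s_client -connect <endpoint-host>:443 -tls1_2 </dev/null 2>/dev/null | grep -E 'Protocol|Cipher'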

Requirements

  • Your Databricks account must include the Enhanced Security and Compliance add-on. For details, see the pricing page.

  • Your Databricks workspace must be on the Enterprise pricing tier.

  • Single sign-on (SSO) authentication must be configured for the workspace.

  • Your Databricks workspace’s root S3 bucket cannot have a period character (.) in its name, such as my-bucket-1.0. If an existing workspace’s root S3 bucket has a period character in the name, contact your Databricks account team before enabling the compliance security profile.

  • Instance types are limited to those that provide both hardware-implemented network encryption between cluster nodes and encryption at rest for local disks. The supported instance types are listed below; a sketch of pinning a supported type at cluster creation follows the note after this list:

    • General purpose: M-fleet, Md-fleet, M5dn, M5n, M5zn, M7g, M7gd, M6i, M7i, M6id, M6in, M6idn, M6a, M7a

    • Compute optimized: C5a, C5ad, C5n, C6gn, C7g, C7gd, C7gn, C6i, C6id, C7i, C6in, C6a, C7a

    • Memory optimized: R-fleet, Rd-fleet, R7g, R7gd, R6i, R7i, R7iz, R6id, R6in, R6idn, R6a, R7a

    • Storage optimized: D3, D3en, P3dn, R5dn, R5n, I4i, I4g, I3en, Im4gn, Is4gen

    • Accelerated computing: G4dn, G5, P4d, P4de, P5

Note

Fleet instances are not available in AWS GovCloud.
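
To avoid provisioning an unsupported type, you can pin one of the instance types above explicitly when you create compute. The following is a minimal sketch against the Clusters API; the workspace URL, token, runtime version, and m6i.xlarge node type are placeholder assumptions, not requirements.

  # Sketch: create a cluster pinned to a supported Nitro instance type (m6i.xlarge).
  # <workspace-url> and $DATABRICKS_TOKEN are placeholders.
  curl -X POST https://<workspace-url>/api/2.0/clusters/create \
    -H "Authorization: Bearer $DATABRICKS_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
      "cluster_name": "csp-cluster",
      "spark_version": "14.3.x-scala2.12",
      "node_type_id": "m6i.xlarge",
      "num_workers": 1
    }'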

Step 1: Prepare a workspace for the compliance security profile

Follow these steps when you create a new workspace with the security profile enabled or when you enable it on an existing workspace.

  1. Check your workspace for long-running clusters before you enable the compliance security profile. When you enable the compliance security profile, long-running clusters are automatically restarted at the configured frequency and during the configured window of automatic cluster update. See Automatic cluster update. One way to list running clusters and when they started is sketched below.
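
    A minimal sketch, assuming jq is installed and using a placeholder workspace URL and token, that lists running clusters with their start times (epoch milliseconds):

      # Sketch: list running clusters and when they started.
      curl -s https://<workspace-url>/api/2.0/clusters/list \
        -H "Authorization: Bearer $DATABRICKS_TOKEN" |
        jq '.clusters[] | select(.state == "RUNNING") | {cluster_name, start_time}'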

  2. Ensure that single sign-on (SSO) authentication is configured. See Set up SSO in your Databricks account console.

  3. Add required network ports. The required network ports depend on whether PrivateLink back-end connectivity (private connectivity for the classic compute plane) is enabled.

    • PrivateLink back-end connectivity enabled: You must update your network security group to allow bidirectional access to port 2443 for FIPS encryption connections. For more information, see Step 1: Configure AWS network objects.

    • No PrivateLink back-end connectivity: You must update your network security group to allow outbound access to port 2443 to support FIPS encryption endpoints. See Security groups.
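
    For the second case, the outbound rule can be added with the AWS CLI. A minimal sketch; the security group ID is a placeholder for your workspace security group:

      # Sketch: allow outbound TCP on port 2443 from the workspace security group.
      aws ec2 authorize-security-group-egress \
        --group-id sg-0123456789abcdef0 \
        --protocol tcp --port 2443 --cidr 0.0.0.0/0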

  4. If your workspace is in the US East, US West, or Canada (Central) region and is configured to restrict outbound network access, you must allow traffic to additional endpoints to support FIPS endpoints for the S3 service. This applies to the S3 service but not to STS and Kinesis endpoints, because AWS does not yet provide FIPS endpoints for STS and Kinesis.

    • For S3, allow outgoing traffic to the endpoints s3.<region>.amazonaws.com and s3-fips.<region>.amazonaws.com, for example s3.us-east-1.amazonaws.com and s3-fips.us-east-1.amazonaws.com.

  5. Run the following tests to verify that the changes were correctly applied:

    1. Launch a Databricks cluster with 1 driver and 1 worker, any DBR version, and any instance type.

    2. Create a notebook attached to the cluster. Use this cluster for the following tests.

    3. In the notebook, validate DBFS connectivity by running:

      %fs ls /
      %sh ls /dbfs
      

      Confirm that a file listing appears without errors.

    4. Confirm access to the control plane instance for your region. Get the address from the table IP addresses and domains and look for the Webapp endpoint for your VPC region.

      %sh nc -zv <webapp-domain-name> 443
      

      For example, for VPC region us-west-2:

      %sh nc -zv oregon.cloud.databricks.com 443
      

      Confirm the result says it succeeded.

    5. Confirm access to the SCC relay for your region. Get the address from the table IP addresses and domains and look for the SCC relay endpoint for your VPC region.

      %sh nc -zv <scc-relay-domain-name> 2443
      

      For example, for VPC region us-west-1:

      %sh nc -zv tunnel.cloud.databricks.com 2443
      

      Confirm that the result says it succeeded.

    6. If your workspace is in the US East, US West, or Canada (Central) region, confirm access to the S3 endpoints for your region.

      %sh nc -zv <bucket-name>.s3-fips.<region>.amazonaws.com 443
      

      For example, for VPC region us-west-1:

      %sh nc -zv acme-company-bucket.s3-fips.us-west-1.amazonaws.com 443
      

      Confirm that the result indicates success.

    7. In the same notebook, validate that the cluster Spark config points to the desired endpoints. For example:

      >>> spark.conf.get("fs.s3a.stsAssumeRole.stsEndpoint")
      "sts.us-west-1.amazonaws.com"
      
      >>> spark.conf.get("fs.s3a.endpoint")
      "s3-fips.us-west-2.amazonaws.com"
      
  6. Confirm that all existing compute in all affected workspaces uses only the instance types that are supported by the compliance security profile, listed in Requirements above.

    Any workload that uses an instance type outside of this list fails to start with an invalid_parameter_exception. One way to audit the node types in use is sketched below.
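
    A minimal sketch, assuming jq and placeholder credentials, that lists the node types in use so you can compare them against the supported list:

      # Sketch: show each cluster's worker and driver node types.
      curl -s https://<workspace-url>/api/2.0/clusters/list \
        -H "Authorization: Bearer $DATABRICKS_TOKEN" |
        jq '.clusters[] | {cluster_name, node_type_id, driver_node_type_id}'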

Step 2: Enable the compliance security profile on a workspace

Note

Databricks Assistant is disabled by default on workspaces that have the compliance security profile enabled. Workspace admins can enable it by following the instructions in Enable or disable Databricks Assistant.

  1. Enable the compliance security profile.

    To directly enable the compliance security profile on a workspace and optionally add compliance standards, see Enable enhanced security and compliance features on a workspace.

    You can also set an account-level default for new workspaces to enable the security profile and optionally choose to add compliance standards on new workspaces. See Set account-level defaults for new workspaces.

    Updates might take up to six hours to propagate to all environments. Workloads that are actively running continue with the settings that were in effect when the compute resource started; new settings apply the next time these workloads are started.

  2. Restart all running compute.
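
    You can restart compute from the UI, or script it. A minimal sketch using the Clusters API; the workspace URL and token are placeholders, and jq is assumed to be installed:

      # Sketch: restart every running cluster so it picks up the new settings.
      for id in $(curl -s https://<workspace-url>/api/2.0/clusters/list \
          -H "Authorization: Bearer $DATABRICKS_TOKEN" |
          jq -r '.clusters[] | select(.state == "RUNNING") | .cluster_id'); do
        curl -s -X POST https://<workspace-url>/api/2.0/clusters/restart \
          -H "Authorization: Bearer $DATABRICKS_TOKEN" \
          -d "{\"cluster_id\": \"$id\"}"
      done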

Step 3: Confirm that the compliance security profile is enabled for a workspace

To confirm that a workspace is using the compliance security profile, check that it has the yellow shield logo displayed in the user interface.

  • A shield logo appears in the top-right of the page, to the left of the workspace name:

    Shield logo small.
  • Click the workspace name to see a list of the workspaces that you have access to. Workspaces that have the compliance security profile enabled show a shield icon followed by the text “Compliance security profile”.

    Shield logo large.

You can also confirm that a workspace is using the compliance security profile from the Security and compliance tab on the workspace page in the account console.

Shield account.

If the shield icons are missing for a workspace with the compliance security profile enabled, contact your Databricks account team.