Enhanced Security Monitoring
Enhanced Security Monitoring provides an enhanced disk image (a CIS-hardened Ubuntu Advantage AMI) and additional security monitoring agents that generate logs that you can review. Two of the monitor agents run on compute resources (cluster workers) in your workspace’s Classic data plane in your AWS account. This applies to clusters for notebooks and jobs, as well as disk images that are used for pro or classic SQL warehouses.
The data plane enhancements that are discussed in this document apply only to the Classic data plane in your AWS account. The additional security controls and monitoring do not apply to serverless compute, which runs compute resources in the serverless data plane in your Databricks account. For example, these new controls apply to pro and classic SQL warehouses, but do not apply to serverless SQL warehouses.
Requirements
Your Databricks workspace is on the E2 version of the platform.
Your Databricks workspace is on the Enterprise tier.
Enable Databricks Enhanced Security Monitoring
Contact your Databricks representative to request that Databricks enable the feature for your workspace.
Wait for confirmation that it is enabled for your workspace.
Terminate all compute resources in the workspace, such as clusters and SQL warehouses.
Restart your compute resources.
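For cluster compute, the terminate-and-restart steps above could be scripted against the Databricks Clusters REST API. The following is a minimal sketch, not an official procedure: it assumes the `/api/2.0/clusters/list` and `/api/2.0/clusters/restart` endpoints, a personal access token, and a workspace URL that you supply. The helper names (`build_request`, `running_cluster_ids`, `restart_running_clusters`) are hypothetical, pagination of the list response is ignored, and SQL warehouses are not covered.

```python
import json
import urllib.request


def build_request(host, token, path, payload=None):
    """Build an authenticated request; POST when a payload is given, else GET."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return urllib.request.Request(
        url=f"{host.rstrip('/')}{path}",
        data=data,
        method="POST" if payload is not None else "GET",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def running_cluster_ids(list_response):
    """Return IDs of clusters whose state is RUNNING in a clusters/list response."""
    return [
        cluster["cluster_id"]
        for cluster in list_response.get("clusters", [])
        if cluster.get("state") == "RUNNING"
    ]


def restart_running_clusters(host, token):
    """Restart every running cluster so it relaunches on the current image.

    Assumption: a restart is an acceptable stand-in for terminate-then-start.
    """
    list_req = build_request(host, token, "/api/2.0/clusters/list")
    with urllib.request.urlopen(list_req) as resp:
        cluster_ids = running_cluster_ids(json.load(resp))
    for cluster_id in cluster_ids:
        restart_req = build_request(
            host, token, "/api/2.0/clusters/restart", {"cluster_id": cluster_id}
        )
        urllib.request.urlopen(restart_req).close()
    return cluster_ids
```

Calling `restart_running_clusters("https://<your-workspace-url>", "<token>")` would return the IDs it attempted to restart. Clusters that are already terminated are skipped because they pick up the new image the next time they start.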
Disk image with enhanced hardening
While Databricks Enhanced Security Monitoring is enabled, Databricks compute resources (cluster worker images) in your Classic data plane use an enhanced hardened operating system image based on Ubuntu Advantage. Ubuntu Advantage is a package of enterprise security and support for open source infrastructure and applications that includes the following:
A CIS Level 1 hardened image
FIPS 140-2 Level 1 validated encryption modules
Monitoring agents in Databricks compute images
While Databricks Enhanced Security Monitoring is enabled, there are additional security monitoring agents, including two agents that are pre-installed in the images that are used for Databricks compute resource VMs. You cannot disable the monitoring agents that are in the enhanced disk image.
Monitoring agent | Description | How to get output |
---|---|---|
Capsule8 | Monitors for file integrity and security boundary violations. This monitor agent runs on the worker VM in your cluster. | Configure audit log delivery and review logs for new rows. |
ClamAV | Scans the filesystem for viruses including daily on-host virus scanning. This monitor agent runs on the VMs in your compute resources such as clusters and pro or classic SQL warehouses. ClamAV scans the entire host OS filesystem and the Databricks Runtime container filesystem. Anything outside the cluster VMs is outside of its scanning scope. | Configure audit log delivery and review logs for new rows. |
Qualys | Scans the container host (VM) for certain known vulnerabilities and CVEs. The scanning happens in representative images in the Databricks environments. | Request scan reports on the image from your Databricks representative. |
File integrity monitoring (Capsule8)
The data plane image includes Capsule8, a file integrity monitoring service that provides runtime visibility and threat detection for compute resources (cluster workers) in the Classic data plane in your account.
The Capsule8 monitoring output is generated within audit logs. To access these logs, an admin must set up audit log delivery to an Amazon S3 bucket. For the JSON schema for new auditable events that are specific to Capsule8, see Audit log schemas for Capsule8 and ClamAV.
Important
It is your responsibility to review Capsule8 logs. At the sole discretion of Databricks, Databricks may review these logs but does not make a commitment to do so. If the agent detects malicious activity, it is your responsibility to triage these events and open a support ticket with Databricks if the resolution or remediation requires an action by Databricks. Databricks may take action on the basis of these logs, including suspending or terminating the resources, but does not make any commitment to do so.
Anti-virus and malware detection (ClamAV)
The enhanced data plane image includes ClamAV, an open source antivirus engine for detecting trojans, viruses, malware, and other malicious threats. ClamAV scans the entire host OS filesystem and the Databricks Runtime container filesystem. Anything outside the cluster VMs is outside of its scanning scope.
The ClamAV monitoring output is generated within audit logs. To access these logs, an admin must set up audit log delivery to an Amazon S3 bucket. For the JSON schema for new auditable events that are specific to ClamAV, see Audit log schemas for Capsule8 and ClamAV.
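Once audit log delivery is configured, the monitoring events can be separated from the rest of the delivered audit records. The sketch below is illustrative only. It assumes that each delivered file contains one JSON record per line with `serviceName` and `actionName` fields, and that the agents report under the service names `capsule8-alerts-dataplane` and `clamAVScanService-dataplane`. Treat those names as assumptions and confirm them against the audit log schema reference. The function names are hypothetical.

```python
import json
from collections import Counter

# Assumed service names for the monitoring agents; verify against the
# audit log schema reference before relying on them.
MONITOR_SERVICES = frozenset(
    {"capsule8-alerts-dataplane", "clamAVScanService-dataplane"}
)


def monitor_events(lines, services=MONITOR_SERVICES):
    """Yield audit records (parsed JSON lines) emitted by the monitoring agents."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        if record.get("serviceName") in services:
            yield record


def summarize(events):
    """Count events per (serviceName, actionName) pair for a quick triage view."""
    return Counter((e.get("serviceName"), e.get("actionName")) for e in events)
```

For example, after downloading a delivered log file from your S3 bucket, `summarize(monitor_events(open(path)))` gives a per-action count that can help decide which events to triage first.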
Important
It is your responsibility to review ClamAV logs. At the sole discretion of Databricks, Databricks may review these logs but does not make a commitment to do so. If the agent detects malicious activity, it is your responsibility to triage these events and open a support ticket with Databricks if the resolution or remediation requires an action by Databricks. Databricks may take action on the basis of these logs, including suspending or terminating the resources, but does not make any commitment to do so.
When a new AMI is built, updated signature files are included within the new AMI.
Vulnerability scans (Qualys)
A monitor agent called Qualys performs vulnerability scans of the container host (VM) for certain known CVEs.
Important
The scanning happens in representative images in the Databricks environments.
You can request the Qualys scan reports from your Databricks representative.
When vulnerabilities are found via Qualys, Databricks tracks them against its Vulnerability Management SLA and releases an updated image when available. It is your responsibility to restart all compute resources regularly to keep the image up-to-date with the latest image version.
Management and upgrade of monitoring agents
The additional monitoring agents that are on the disk images used for the compute resources in the Classic data plane are part of the standard Databricks process for upgrading systems:
The Classic data plane base disk image (AMI) is owned, managed, and patched by Databricks.
Databricks delivers and applies security patches by releasing new disk images (AMIs). The delivery schedule depends on new functionality and the SLA for discovered vulnerabilities. Typical delivery is every 2-4 weeks.
The base operating system for the data plane is Ubuntu Advantage 18.04 LTS.
Databricks clusters and pro or classic SQL warehouses are ephemeral by default. Upon launch, clusters and pro or classic SQL warehouses use the latest available base image. Older versions that may have security vulnerabilities are unavailable for new clusters.
You are responsible for ensuring that you do not have long-running clusters.
You are responsible for restarting clusters (using the UI or API) regularly to ensure they use the latest patched host VM images.
Upon request, Databricks can share a notebook that lists your workspace’s running clusters, identifies hosts older than a specified number of days, and optionally restarts a cluster.
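The age check described in the last item can be approximated in a few lines. This is a sketch under stated assumptions and is not the notebook that Databricks provides: it assumes cluster records shaped like the Clusters API list response, with `state`, `cluster_id`, and epoch-millisecond `start_time` and `last_restarted_time` fields, and it uses the cluster's last start as a proxy for host age. The function name `stale_clusters` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def stale_clusters(clusters, max_age_days, now=None):
    """Return (cluster_id, age_in_days) for running clusters older than the limit.

    Age is measured from last_restarted_time when present, else start_time.
    Both are assumed to be epoch milliseconds.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for cluster in clusters:
        if cluster.get("state") != "RUNNING":
            continue  # terminated clusters get the latest image on next start
        started_ms = cluster.get("last_restarted_time") or cluster.get("start_time")
        if started_ms is None:
            continue  # no timestamp available, so age cannot be determined
        started = datetime.fromtimestamp(started_ms / 1000, tz=timezone.utc)
        if started < cutoff:
            stale.append((cluster["cluster_id"], (now - started).days))
    return stale
```

Given the typical 2-4 week image delivery cadence mentioned above, a `max_age_days` value in that range may be a reasonable starting point, but the right threshold depends on your own patching policy.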
Monitor agent termination
If a monitor agent on the worker VM is found to be not running because of a crash or other termination, the system attempts to restart the agent.
Data retention policy for monitor agent data
ClamAV and Capsule8 logs are sent to your own Amazon S3 bucket as part of audit log delivery. Retention, ingestion, and analysis of these logs are your responsibility.
Qualys vulnerability reports and logs are retained for at least one year by Databricks in the Qualys SaaS platform. You can request the vulnerability reports and logs from your Databricks representative.