Cluster Log Delivery

When you create a cluster, you can specify a location for delivering Spark driver and worker logs. Logs are delivered to your chosen destination every five minutes. When a cluster is terminated, Databricks guarantees to deliver all logs generated up to the moment of termination.

Logs are delivered to a subdirectory of the destination named after the cluster ID. For example, if the specified destination is dbfs:/cluster-log-delivery, logs for cluster 0630-191345-leap375 are delivered to dbfs:/cluster-log-delivery/0630-191345-leap375.
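
Within that per-cluster folder, logs are further grouped by type. The layout below is an illustrative sketch; the subfolder names reflect the usual driver/executor/event-log grouping and may vary:

dbfs:/cluster-log-delivery/0630-191345-leap375/
    driver/      <- stdout, stderr, and log4j output from the Spark driver
    executor/    <- logs from the Spark executors (workers)
    eventlog/    <- Spark event logs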

To configure the log delivery location:

  1. On the cluster configuration page, click the Advanced Options toggle.

  2. At the bottom of the page, click the Logging tab.

  3. Select a destination type (for example, DBFS or S3).

  4. Enter the cluster log path. (An equivalent Clusters API setting is sketched below.)
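
If you create clusters through the REST API instead of the UI, the same setting is expressed as a cluster_log_conf object in the cluster specification. A minimal sketch for a DBFS destination (the path is a placeholder):

{
  "cluster_log_conf": {
    "dbfs": {
      "destination": "dbfs:/cluster-log-delivery"
    }
  }
}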

S3 bucket destinations

If you choose an S3 destination, you must configure the cluster with an IAM role that can access the bucket. The IAM role must have the s3:PutObject and s3:PutObjectAcl permissions; the example policy below grants these, along with list, read, and delete permissions that are useful for working with the delivered logs. See Secure Access to S3 Buckets Using IAM Roles for instructions on setting up an IAM role.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<my-s3-bucket>"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:PutObjectAcl",
        "s3:GetObject",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::<my-s3-bucket>/*"
      ]
    }
  ]
}
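
When creating the cluster, the S3 destination is supplied through the cluster_log_conf field of the Clusters API, and the IAM role is attached via aws_attributes.instance_profile_arn (a complete request sketch follows the Note below). A minimal sketch of the S3 log configuration, with placeholder bucket and region values; enable_encryption is an optional flag for server-side encryption:

{
  "cluster_log_conf": {
    "s3": {
      "destination": "s3://<my-s3-bucket>/cluster-logs",
      "region": "us-west-2",
      "enable_encryption": true
    }
  }
}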

Note

This feature is also available in the REST API. See Clusters API and Cluster log delivery examples.
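
For instance, a request body for POST /api/2.0/clusters/create that enables S3 log delivery might look like the following sketch; the cluster name, Spark version, node type, and the placeholder ARN and bucket values are illustrative and must match your workspace:

{
  "cluster_name": "log-delivery-example",
  "spark_version": "7.3.x-scala2.12",
  "node_type_id": "i3.xlarge",
  "num_workers": 2,
  "aws_attributes": {
    "instance_profile_arn": "arn:aws:iam::<account-id>:instance-profile/<my-instance-profile>"
  },
  "cluster_log_conf": {
    "s3": {
      "destination": "s3://<my-s3-bucket>/cluster-logs",
      "region": "us-west-2"
    }
  }
}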