Using the AWS CloudFormation Quickstart template to connect to AWS S3

This page describes how to connect to an AWS S3 bucket by creating an external location object using a provided AWS CloudFormation Quickstart template. When you create an external location using the AWS CloudFormation template, Databricks configures the external location and creates a storage credential for you.

Databricks recommends this method if you want to quickly set up a new connection to an S3 location, and you don't already have an existing storage credential. For an overview of other supported methods, see Connect to an AWS S3 external location.

Before you begin

Prerequisites:

You must create the S3 bucket that you want to use as an external location in AWS before you create the external location object in Databricks.

  • Do not use dot notation (for example, incorrect.bucket.name.notation) in S3 bucket names. Although AWS allows dots in bucket names, Databricks does not support S3 buckets with dot notation. Buckets containing dots can cause compatibility issues with features like Delta Sharing due to SSL certificate validation failures. For more information, see the AWS bucket naming best practices.

  • External location paths must contain only standard ASCII characters (letters A–Z, a–z, digits 0–9, and common symbols like /, _, -). A quick validation sketch for this rule and the bucket-name rule above follows this list.

  • The bucket cannot have an S3 access control list attached to it.

  • Avoid using a path in S3 that is already defined as an external location in another Unity Catalog metastore. You can safely read data in a single external S3 location from more than one metastore, but concurrent writes to the same S3 location from multiple metastores can lead to consistency issues.
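
If you are scripting your setup, you can catch these naming problems before creating the external location object. The following Python sketch is purely illustrative (the helper and its checks are not part of any Databricks or AWS API); it flags dotted bucket names and non-ASCII paths:

    # Illustrative pre-flight check; not part of any Databricks or AWS API.
    def check_s3_location(bucket: str, prefix: str = "") -> list[str]:
        problems = []
        if "." in bucket:
            problems.append(f"bucket '{bucket}' uses dot notation, which Databricks does not support")
        path = f"s3://{bucket}/{prefix}"
        if not path.isascii():
            problems.append(f"path '{path}' contains non-ASCII characters")
        return problems

    # The dotted example bucket name above fails the check; a plain name passes.
    print(check_s3_location("incorrect.bucket.name.notation", "raw/events"))
    print(check_s3_location("my-ingest-bucket", "raw/events"))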

Databricks permissions requirements:

  • You must have the CREATE STORAGE CREDENTIAL privilege on the metastore. Metastore admins have CREATE STORAGE CREDENTIAL on the metastore by default.
  • You must have the CREATE EXTERNAL LOCATION privilege on both the metastore and the storage credential referenced in the external location. Metastore admins have CREATE EXTERNAL LOCATION on the metastore by default.
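
If you are a metastore admin and need to give another user or group these privileges before they run the Quickstart, you can grant them from a notebook. This is a minimal sketch: the group name data-engineers is a placeholder, and spark is the SparkSession that Databricks notebooks predefine.

    # Placeholder principal; substitute a real user, group, or service principal.
    spark.sql("GRANT CREATE STORAGE CREDENTIAL ON METASTORE TO `data-engineers`")
    spark.sql("GRANT CREATE EXTERNAL LOCATION ON METASTORE TO `data-engineers`")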

AWS permissions requirements:

  • You must have iam:CreateRole permissions to create the IAM role.

Step 1: Create an external location for an S3 bucket using an AWS CloudFormation template

  1. Log in to a workspace that is attached to the metastore.

  2. Click Catalog to open Catalog Explorer.

  3. Click Add or the plus icon, then click Create an external location.

  4. On the Create a new external location dialog, select AWS Quickstart (Recommended) then click Next.

    The AWS Quickstart configures the external location and creates a storage credential for you. If you choose the Manual option instead, you must create an IAM role that grants access to the S3 bucket and then create the storage credential in Databricks yourself.

  5. On the Create external location with Quickstart dialog, enter the path to the S3 bucket in the Bucket Name field.

  6. Click Generate new token to generate the personal access token that the CloudFormation template uses to authenticate to Databricks.

  7. Copy the token and click Launch in Quickstart.

  8. In the AWS CloudFormation template that launches (labeled Quick create stack), paste the token into the Databricks Personal Access Token field.

  9. Accept the terms at the bottom of the page (I acknowledge that AWS CloudFormation might create IAM resources with custom names).

  10. Click Create stack.

    It may take a few minutes for the CloudFormation template to finish creating the external location object in Databricks. If the CloudFormation template fails to create the external location, verify that your role has iam:CreateRole permissions.
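
If the stack appears stuck or fails, you can inspect it from the AWS side. The following is a rough boto3 sketch; the stack name is a placeholder, so use the name shown on the CloudFormation Stacks page.

    import boto3

    cfn = boto3.client("cloudformation")
    stack_name = "databricks-s3-ingest-quickstart"  # placeholder; copy the real stack name

    status = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]["StackStatus"]
    print(f"Stack status: {status}")

    # If the stack did not reach CREATE_COMPLETE, print failed resources and reasons.
    # Missing iam:CreateRole permissions typically surface here.
    if status != "CREATE_COMPLETE":
        for event in cfn.describe_stack_events(StackName=stack_name)["StackEvents"]:
            if event["ResourceStatus"].endswith("FAILED"):
                print(event["LogicalResourceId"], event.get("ResourceStatusReason", ""))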

Step 2: Verify the external location

  1. In your Databricks workspace, click Catalog to open Catalog Explorer.

  2. On the Quick access page, click the External data > button.

  3. In the External Locations tab, confirm that your external location has been created.

    Automatically generated external locations use the naming syntax db_s3_external_databricks-S3-ingest-<id>.

  4. Click the name of the external location, then click the Browse tab. As the owner of the external location, you should see your files from the S3 bucket in Catalog Explorer. If you prefer to verify from code, see the sketch at the end of this section.

  5. (Optional) Bind the external location to specific workspaces.

    By default, any privileged user can use the external location on any workspace attached to the metastore. If you want to allow access only from specific workspaces, go to the Workspaces tab and assign workspaces. See Assign an external location to specific workspaces.

  6. Grant permission to use the external location.

    For anyone to use the external location, you must grant permissions:

    • To use the external location to add a managed storage location to a metastore, catalog, or schema, grant the CREATE MANAGED STORAGE privilege.
    • To create external tables or volumes, grant CREATE EXTERNAL TABLE or CREATE EXTERNAL VOLUME.

    To use Catalog Explorer to grant permissions (a SQL-based sketch follows these steps):

    1. Click the external location name to open the details pane.
    2. On the Permissions tab, click Grant.
    3. On the Grant on <external location> dialog, select users, groups, or service principals in the Principals field, and select the privilege you want to grant.
    4. Click Grant.
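
You can also verify the external location from code. Here is a minimal sketch using the Databricks SDK for Python (the databricks-sdk package), assuming it runs somewhere with workspace authentication already configured; the location name below is a placeholder.

    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient()  # picks up notebook or default authentication

    # List the external locations you can see; the Quickstart-created one should appear.
    for loc in w.external_locations.list():
        print(loc.name, loc.url, loc.credential_name)

    # Placeholder name following the db_s3_external_databricks-S3-ingest-<id> pattern.
    loc = w.external_locations.get(name="db_s3_external_databricks-S3-ingest-12345")
    print(loc.url, loc.owner)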
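
If you prefer to grant these privileges in SQL rather than in Catalog Explorer, the following is a minimal sketch to run from a notebook. Both names are placeholders, and spark and display are predefined in Databricks notebooks.

    # Substitute your external location name and principal.
    location = "db_s3_external_databricks-S3-ingest-12345"
    principal = "data-engineers"

    spark.sql(f"GRANT CREATE EXTERNAL TABLE, CREATE EXTERNAL VOLUME ON EXTERNAL LOCATION `{location}` TO `{principal}`")
    spark.sql(f"GRANT CREATE MANAGED STORAGE ON EXTERNAL LOCATION `{location}` TO `{principal}`")

    # Confirm what has been granted.
    display(spark.sql(f"SHOW GRANTS ON EXTERNAL LOCATION `{location}`"))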