Connect to a Cloudflare R2 external location

This page describes how to connect to a Cloudflare R2 external location. After you connect, you can govern access to these R2 objects using Unity Catalog.

To successfully connect to a Cloudflare R2 path, you need two Unity Catalog securable objects. The first is a storage credential, which specifies an R2 API token that allows access to the R2 location. You need this storage credential for the second required object: an external location, which defines the path to your R2 storage location and the credentials required to access that location.

Requirements

  • Databricks workspace enabled for Unity Catalog.

  • Databricks Runtime 14.3 or above, or SQL warehouse 2024.15 or above.

    If you encounter the error message No FileSystem for scheme "r2", your compute is probably on an unsupported version.

  • A Cloudflare account. See https://dash.cloudflare.com/sign-up.

  • A Cloudflare R2 Admin role. See the Cloudflare roles documentation.

  • CREATE STORAGE CREDENTIAL and CREATE EXTERNAL LOCATION privileges on the Unity Catalog metastore attached to the workspace. Account admins and metastore admins have these privileges by default.
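If you are unsure which runtime your compute is on, you can check it from a notebook or the SQL editor before troubleshooting further. This is a quick sketch that assumes the `current_version` SQL function is available on your compute:

```sql
-- Returns a struct that includes the Databricks Runtime version.
-- Confirm that dbr_version is 14.3 or above.
SELECT current_version().dbr_version;
```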

Step 1: Configure an R2 bucket

  1. Create a Cloudflare R2 bucket.

    You can use the Cloudflare dashboard or the Cloudflare Wrangler tool.

    See the Cloudflare R2 “Get started” documentation or the Wrangler documentation.

  2. Create an R2 API Token and apply it to the bucket.

    See the Cloudflare R2 API authentication documentation.

    Set the following token properties:

    • Permissions: Object Read & Write.

      This permission grants read and write access, which is required when you use R2 storage as a replication target, as described in Use Cloudflare R2 replicas or migrate storage to R2.

      If you want to enforce read-only access from Databricks to the R2 bucket, you can instead create a token that grants read access only. However, this may be unnecessary, because you can mark the storage credential as read-only, and any write access granted by this permission will be ignored.

    • (Optional) TTL: The length of time that you want to share the bucket data with the data recipients.

    • (Optional) Client IP Address Filtering: Select if you want to limit network access to specified recipient IP addresses. If this option is enabled, you must specify your recipients' IP addresses and you must allowlist the Databricks control plane NAT IP address for the workspace region.

    See Outbound IPs from Databricks control plane.

  3. Copy the R2 API token values:

    • Access Key ID
    • Secret Access Key
    Important: Token values are shown only once.

  4. On the R2 homepage, go to Account details and copy the R2 account ID.

Step 2: Create the storage credential

  1. In Databricks, log in to your workspace.

  2. In the sidebar, click Catalog.

  3. On the Quick access page, click the External data > button, go to the Credentials tab, and select Create credential.

  4. Select Storage credential.

  5. Select a Credential Type of Cloudflare API Token.

  6. Enter a name for the credential and the following values that you copied when you configured the R2 bucket:

    • Account ID
    • Access key ID
    • Secret access key
  7. (Optional) If you want users to have read-only access to the external locations that use this storage credential, in Advanced options select Read only.

    Do not select this option if you want to use the storage credential to access R2 storage that you are using as a replication target, as described in Use Cloudflare R2 replicas or migrate storage to R2.

    For more information, see Mark a storage credential as read-only.

  8. Click Create.

  9. In the Storage credential created dialog, copy the External ID.

  10. (Optional) Bind the storage credential to specific workspaces.

    By default, a storage credential can be used by any privileged user on any workspace attached to the metastore. If you want to allow access only from specific workspaces, go to the Workspaces tab and assign workspaces. See Assign a storage credential to specific workspaces.
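As a sanity check after this step, you can inspect the new credential in SQL. The credential name `my_r2_credential` below is a hypothetical placeholder for the name you chose:

```sql
-- Shows the credential's type, owner, and whether it is read-only.
DESCRIBE STORAGE CREDENTIAL `my_r2_credential`;
```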

Step 3: Create the external location

To create the external location, use Catalog Explorer if you prefer using a graphical UI, or SQL if you prefer programmatic creation.

Option 1: Create an external location using Catalog Explorer

  1. Log in to a workspace that is attached to the metastore.

  2. In the sidebar, click Catalog.

  3. On the Quick access page, click the External data > button, go to the External Locations tab, and click Create external location.

  4. On the Create a new external location dialog, click Manual, then Next.

    You must use the Manual option. The AWS Quickstart option does not support R2.

  5. On the Create a new external location manually dialog, enter an External location name.

  6. Under Storage type, select R2.

  7. Under URL, enter the path. For example, r2://my-bucket@my-account-id.r2.cloudflarestorage.com.

  8. Under Storage credential, select the storage credential that grants access to the external location.

  9. (Optional) If you want users to have read-only access to the external location, click Advanced Options and select Limit to read-only use. You can change this setting later. For more information, see Mark an external location as read-only.

  10. (Optional) If the external location is intended for legacy workload migration, click Advanced options and enable Fallback mode.

    See Enable fallback mode on external locations.

  11. Click Create.

  12. (Optional) Bind the external location to specific workspaces.

    By default, any privileged user can use the external location on any workspace attached to the metastore. If you want to allow access only from specific workspaces, go to the Workspaces tab and assign workspaces. See Assign an external location to specific workspaces.

  13. Go to the Permissions tab to grant permission to use the external location.

    Before anyone can use the external location, you must grant the appropriate privileges:

    • To use the external location to add a managed storage location to a metastore, catalog, or schema, grant the CREATE MANAGED LOCATION privilege.

    • To create external tables or volumes, grant CREATE EXTERNAL TABLE or CREATE EXTERNAL VOLUME.

    1. Click Grant.
    2. On the Grant on <external location> dialog, select users, groups, or service principals in the Principals field, and select the privileges that you want to grant.
    3. Click Grant.
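If you prefer SQL, you can issue the same grants from a notebook or the SQL editor. The external location name `my_r2_location` and the group `data_engineers` below are hypothetical placeholders:

```sql
-- Allow a group to create external tables and volumes at this location.
GRANT CREATE EXTERNAL TABLE, CREATE EXTERNAL VOLUME
ON EXTERNAL LOCATION `my_r2_location`
TO `data_engineers`;
```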

Option 2: Create an external location using SQL

To create an external location using SQL, run the following command in a notebook or the SQL query editor. Replace the placeholder values.

  • <location-name>: A name for the external location. If the name includes special characters, such as hyphens (-), surround it with backticks (` `). See Names.
  • <bucket-path>: The path in your cloud tenant that this external location grants access to. For example, r2://my-bucket@my-account-id.r2.cloudflarestorage.com.
  • <storage-credential-name>: The name of the storage credential that authorizes reading from and writing to the bucket. If the storage credential name includes special characters, such as hyphens (-), it must be surrounded by backticks (` `).
```sql
CREATE EXTERNAL LOCATION [IF NOT EXISTS] `<location-name>`
URL '<bucket-path>'
WITH ([STORAGE] CREDENTIAL `<storage-credential-name>`)
[COMMENT '<comment-string>'];
```
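For example, with the placeholders filled in (all names here are illustrative, not values from your account):

```sql
CREATE EXTERNAL LOCATION IF NOT EXISTS `my-r2-location`
URL 'r2://my-bucket@my-account-id.r2.cloudflarestorage.com'
WITH (STORAGE CREDENTIAL `my-r2-credential`)
COMMENT 'Cloudflare R2 bucket for shared datasets';
```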

If you want to limit external location access to specific workspaces in your account, also known as workspace binding or external location isolation, see Assign an external location to specific workspaces.
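After the external location exists and you have been granted access on it, one way to verify connectivity is to list the path. The bucket and account ID below are illustrative:

```sql
-- Lists objects at the R2 path using the external location's credential.
LIST 'r2://my-bucket@my-account-id.r2.cloudflarestorage.com';
```

If this returns results (or an empty listing for an empty bucket) rather than a FileSystem or permission error, the credential and location are configured correctly.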