Share data using Delta Sharing (Preview)

Preview

Delta Sharing is in Public Preview. To participate in the preview, you must enable the External Data Sharing feature group in the Databricks Account Console. See Enable the External Data Sharing feature group for your account.

Delta Sharing is subject to additional Service Specific Terms. Enabling the External Data Sharing feature group represents acceptance of those service terms.

Delta Sharing (Preview) is an open protocol developed by Databricks for secure data sharing with other organizations regardless of which computing platforms they use. A Databricks user, called a “data provider”, can use Delta Sharing to share data with a person or group outside of their organization, called a “data recipient”. Data recipients can immediately begin working with the latest version of the shared data. For a full list of connectors and information about how to use them, see the Delta Sharing documentation. When Delta Sharing is enabled on a metastore, Unity Catalog runs a Delta Sharing server.

Delta Sharing architectural overview

The shared data is not provided by Databricks directly but by data providers running on Databricks.

Note

By accessing a data provider’s shared data as a data recipient, data recipient represents that it has been authorized to access the data share(s) provided to it by the data provider and acknowledges that (1) Databricks has no liability for such data or data recipient’s use of such shared data, and (2) Databricks may collect information about data recipient’s use of and access to the shared data (including identifying any individual or company who accesses the data using the credential file in connection with such information) and may share it with the applicable data provider.

This article shows how to share data in Unity Catalog with data recipients outside your organization. If you are a data recipient, see Access data shared with you using Delta Sharing.

Requirements

  • Unity Catalog must be enabled, and at least one metastore must exist.

  • Only an account admin can enable Delta Sharing for a metastore.

  • Only a metastore admin or account admin can share data using Delta Sharing. See Metastore admin.

  • To rotate a recipient’s credential, you must use the Unity Catalog CLI. See (Optional) Install the Unity Catalog CLI.

  • To manage shares and recipients, you can use a Data Science & Engineering notebook or a Databricks SQL query.

Key concepts

In Delta Sharing, a share is a read-only collection of tables and table partitions to be shared with one or more recipients. A metastore can contain multiple shares, but each share can belong to only one metastore. You control which recipients have access to each share. If you remove a share, all recipients of that share lose the ability to access it.

A recipient is an object that associates an organization with a credential that allows the organization to access one or more shares. When you create a recipient, a downloadable credential is generated for that recipient. Each metastore can have multiple recipients, but each recipient can belong to only one metastore. A recipient can have access to multiple shares. If you remove a recipient, that recipient loses access to all shares it could previously access.

Shares and recipients exist independently of each other.

Shares and recipients in Delta Sharing

Granting a recipient access to a share involves the following steps:

  1. Join the Delta Sharing Public Preview by enabling the External Data Sharing feature group for your Databricks account.

  2. Enable Delta Sharing for each Unity Catalog metastore.

  3. Create a share associated with one or more tables in the metastore.

  4. Create a recipient. A set of credentials is generated for that recipient.

  5. Grant the recipient privileges on one or more shares.

  6. Using a secure channel, send the recipient an activation link that allows them to download their credentials.

  7. The recipient uses the credential to access the share.
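Steps 3 through 5 correspond to SQL commands described later in this article. As a rough sketch, the following Python helper assembles those statements as strings. The function name and the example names (sales_share, main.retail.orders, acme_corp) are hypothetical, the comment is inserted without any escaping, and nothing here connects to Databricks or checks that the objects exist.

```python
def sharing_statements(share, table, recipient, comment="created for sharing"):
    """Return the SQL statements for steps 3-5, in order.

    Hypothetical helper: it only builds strings. Names and the comment
    are inserted as-is, so pass trusted values only.
    """
    return [
        # Step 3: create the share and associate a table with it.
        f"CREATE SHARE IF NOT EXISTS {share};",
        f"ALTER SHARE {share} ADD TABLE {table};",
        # Step 4: create the recipient (this generates its credential).
        f"CREATE RECIPIENT IF NOT EXISTS {recipient} COMMENT '{comment}';",
        # Step 5: grant the recipient access to the share.
        f"GRANT SELECT ON SHARE {share} TO RECIPIENT {recipient};",
    ]


if __name__ == "__main__":
    for statement in sharing_statements(
        "sales_share", "main.retail.orders", "acme_corp"
    ):
        print(statement)
```

The statements can then be pasted into a notebook or the Databricks SQL editor. Steps 6 and 7 (sending the activation link and using the credential) happen outside SQL.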

The following sections show how to enable Delta Sharing for a Databricks account, enable Delta Sharing on a metastore, manage shares and recipients, and audit Delta Sharing activity.

Enable the External Data Sharing feature group for your account

To participate in the Delta Sharing Public Preview, an account admin must enable the External Data Sharing feature group for your Databricks account. Enabling this feature group represents acceptance of additional Service Specific Terms.

  1. As a Databricks account admin, log in to the Account Console.

  2. In the sidebar, click Settings.

  3. Go to the Feature enablement tab.

  4. On the External Data Sharing Feature Group row, click the Enable button.

    Click the Terms link to review the additional service specific terms for Delta Sharing. Clicking Enable represents acceptance of these terms.

Enable Delta Sharing on a metastore

Follow these steps for each metastore where you plan to share data using Delta Sharing.

  1. Log in to the account console.

  2. In the sidebar, click Data.

  3. Click the name of a metastore to open its details.

  4. Click the checkbox next to Enable Delta Sharing and allow a Databricks user to share data outside their organization.

  5. Databricks recommends that you configure the default recipient token lifetime. This is the time after which recipient tokens expire and must be regenerated. If you do not configure this setting, recipient tokens do not expire.

    When you change the default recipient token lifetime, the recipient token lifetime for existing recipients is not automatically updated. To update the recipient token lifetime for a given recipient, see Rotate a recipient’s credential.

    To configure the default recipient token lifetime:

    1. Enable Set expiration.

    2. Enter a number of seconds, minutes, hours, or days, and select the unit of measure.

    3. Click Enable.

    For more information, see Security recommendations for recipients.

(Optional) Install the Unity Catalog CLI

To manage shares and recipients, you can use SQL commands or the Unity Catalog CLI. The CLI runs in your local environment and does not require Databricks compute resources.

To install the CLI, see (Optional) Install the Unity Catalog CLI in the Unity Catalog documentation.

Modify the recipient token lifetime

When Delta Sharing is enabled, follow these steps to modify the default recipient token lifetime.

Note

The recipient token lifetime for existing recipients is not automatically updated when you change the default recipient token lifetime for a metastore. To update the recipient token lifetime for a given recipient, see Rotate a recipient’s credential.

  1. Log in to the account console.

  2. In the sidebar, click Data.

  3. Click the name of a metastore to open its details.

  4. Enable Set expiration.

  5. Enter a number of seconds, minutes, hours, or days, and select the unit of measure.

  6. Click Enable.

If you disable Set expiration, recipient tokens do not expire. For more information, see Security recommendations for recipients.
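The console accepts the lifetime in seconds, minutes, hours, or days, while the audit log field delta_sharing_recipient_token_lifetime_in_seconds (shown later in this article) carries a value in seconds. A minimal sketch of that conversion follows; the helper name and the unit labels are assumptions made for illustration.

```python
# Seconds in each unit offered by the Set expiration control.
SECONDS_PER_UNIT = {
    "seconds": 1,
    "minutes": 60,
    "hours": 60 * 60,
    "days": 24 * 60 * 60,
}


def lifetime_in_seconds(amount, unit):
    """Convert a token lifetime to seconds (hypothetical helper)."""
    if unit not in SECONDS_PER_UNIT:
        raise ValueError(f"unknown unit: {unit!r}")
    if amount < 0:
        raise ValueError("lifetime must not be negative")
    return amount * SECONDS_PER_UNIT[unit]


if __name__ == "__main__":
    # 365 days matches the 31536000 value in the example audit event.
    print(lifetime_in_seconds(365, "days"))  # prints 31536000
```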

Manage shares

In Delta Sharing, a share is a named object that contains a collection of tables in a metastore that you want to share as a group. A share can contain tables from only a single metastore. You can add or remove tables from a share at any time.

The following sections show how to create, describe, update and delete shares.

Create a share

To create a share, run the following command in a notebook or the Databricks SQL editor. Replace the placeholder values:

  • <share_name>: a descriptive name for the share.

  • <comment>: a comment describing the share.

CREATE SHARE [IF NOT EXISTS] <share_name> [COMMENT <comment>];

After you create a share, you can add tables to it and associate it with one or more recipients.

List shares

To list all shares, use the SHOW SHARES command.

SHOW SHARES;

Describe a share

To list a share’s metadata and all tables associated with a share, use the DESCRIBE SHARE command. Replace <share_name> with the name of the share:

DESCRIBE SHARE <share_name>;

To list all tables in a share, use the SHOW ALL IN SHARE command.

SHOW ALL IN SHARE <share_name>;

Add or remove tables from a share

In the following commands, replace the placeholder values:

  • <share_name>: The name of the share.

  • <catalog_name>: The name of the catalog in the metastore.

  • <schema_name>: The name of the schema in the metastore.

  • <table_name>: The name of the table to add.

  • AS <new_schema_name>.<new_table_name>: Optionally share the table with a new name.

  • <comment>: A comment describing the share.

To associate a table with the share, use the ALTER SHARE ADD TABLE command.

ALTER SHARE <share_name>
ADD TABLE <catalog_name>.<schema_name>.<table_name> [deltaSharingPartitionListSpec]
[AS <new_schema_name>.<new_table_name>]
[COMMENT <comment>];

For details about sharing a partition, see Partition specifications.

To remove a table from a share, use the ALTER SHARE REMOVE TABLE command.

ALTER SHARE <share_name> REMOVE TABLE <catalog_name>.<schema_name>.<table_name>;

Note

You can provide either the original schema and table name in the metastore or the names defined in the share.

When you add or remove tables from a share, the change takes effect the next time a recipient accesses the share.

Partition specifications

To share only part of a table when adding the table to a share, you can provide a partition specification. The following example shares part of the data in the inventory table, given that the table is partitioned by year, month, and date columns.

  • Data for the year 2021.

  • Data for December 2020.

  • Data for December 25, 2019.

ALTER SHARE share_name
ADD TABLE inventory
PARTITION (year = "2021"),
          (year = "2020", month = "Dec"),
          (year = "2019", month = "Dec", date = "2019-12-25");
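When the partition list is long or produced by another process, the clause can be generated rather than typed by hand. The sketch below renders a list of column/value mappings in the same shape as the example above. The function names are hypothetical, and values are quoted naively without escaping.

```python
def partition_clause(specs):
    """Render [{column: value, ...}, ...] as a PARTITION clause.

    Each dict becomes one parenthesized group. Dict insertion order
    determines the column order inside the group.
    """
    groups = []
    for spec in specs:
        pairs = ", ".join(f'{column} = "{value}"' for column, value in spec.items())
        groups.append(f"({pairs})")
    return "PARTITION " + ", ".join(groups)


def add_table_statement(share, table, specs):
    """Build an ALTER SHARE ... ADD TABLE statement with partitions."""
    return f"ALTER SHARE {share} ADD TABLE {table} {partition_clause(specs)};"


if __name__ == "__main__":
    print(
        add_table_statement(
            "share_name",
            "inventory",
            [
                {"year": "2021"},
                {"year": "2020", "month": "Dec"},
                {"year": "2019", "month": "Dec", "date": "2019-12-25"},
            ],
        )
    )
```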

Delete a share

To delete a share, use the DROP SHARE command. Recipients can no longer access data that was previously shared. Replace <share_name> with the name of the share.

DROP SHARE [IF EXISTS] <share_name>;

Manage recipients

A recipient is a named set of credentials that represents an organization with whom to share data. This section shows how to manage recipients in Delta Sharing.

Create a recipient

Use the CREATE RECIPIENT command to create a recipient. Replace the placeholder values:

  • <recipient_name>: A descriptive name for the recipient.

  • <comment>: A comment with more information.

CREATE RECIPIENT [IF NOT EXISTS] <recipient_name> [COMMENT <comment>];

The recipient’s token will expire after the recipient token lifetime has elapsed. For more information, see Security recommendations for recipients.

After creating a recipient:

  1. Use the DESCRIBE RECIPIENT command to get their activation link.

  2. Use a secure channel to share the activation link with them, along with the article showing how to access shared data. The activation link can be accessed only a single time. Recipients should treat the downloaded credential as a secret and must not share it outside of their organization. If necessary, you can rotate a recipient’s credential.

  3. Grant them access to shares.

List recipients

The SHOW RECIPIENTS command lists all recipients. Optionally, add a LIKE clause and replace <pattern> with the pattern that recipient names must match.

SHOW RECIPIENTS [LIKE <pattern>];

Describe a recipient

To view details about a recipient, including its creator, creation timestamp, token lifetime, activation link, and whether the credential has been downloaded, use the DESCRIBE RECIPIENT command. Replace <recipient_name> with the name of the recipient.

DESCRIBE RECIPIENT <recipient_name>;

To show grants to a recipient, see Manage privileges for a recipient.

Delete a recipient

To delete a recipient, use the DROP RECIPIENT command. Replace <recipient_name> with the name of the recipient to drop. When you drop a recipient, the credential is invalidated and they can no longer view shares.

DROP RECIPIENT [IF EXISTS] <recipient_name>;

Rotate a recipient’s credential

You should rotate a recipient’s credential and generate a new activation link:

  • When the existing recipient token is about to expire.

  • If a recipient loses their activation URL or if it is compromised.

  • If the credential is corrupted, lost, or compromised after it is downloaded by a recipient.

  • To update the recipient’s token lifetime after you modify the recipient token lifetime for a metastore.

To rotate a recipient’s credential:

  1. If you have not already done so, install the Unity Catalog CLI.

  2. Run the following command using the Unity Catalog CLI. Arguments in brackets are optional. Replace the placeholder values:

    • <recipient_name>: the name of the recipient.

    • <expiration_seconds>: Optional. The number of seconds until the existing recipient token should expire. During this period, the existing token will continue to work. A value of 0 means the existing token expires immediately.

      uc rotate-recipient-token \
        --name <recipient_name> \
        [--existing-token-expire-in-seconds <expiration_seconds>]
      
  3. View the activation URL by using the DESCRIBE RECIPIENT <recipient_name> command, and share it with the recipient over a secure channel.
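If rotation is scripted, the command line from step 2 can be assembled programmatically. The sketch below only builds the argument list using the flags shown above. The function name is hypothetical, and actually invoking the CLI (for example with subprocess.run) is left to the caller.

```python
def rotate_command(recipient_name, existing_token_expire_in_seconds=None):
    """Build the argument list for `uc rotate-recipient-token`.

    Hypothetical helper. The optional flag is added only when a value
    is supplied. Note that 0 is a meaningful value (expire the existing
    token immediately), so the check is against None, not truthiness.
    """
    args = ["uc", "rotate-recipient-token", "--name", recipient_name]
    if existing_token_expire_in_seconds is not None:
        if existing_token_expire_in_seconds < 0:
            raise ValueError("expiration must not be negative")
        args += [
            "--existing-token-expire-in-seconds",
            str(existing_token_expire_in_seconds),
        ]
    return args


if __name__ == "__main__":
    # Keep the old token valid for one hour while the recipient switches over.
    print(" ".join(rotate_command("acme_corp", 3600)))
```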

Manage privileges for a recipient

After you have created a share and a recipient, use GRANT and REVOKE statements to grant the recipient access to the share. In the following examples, replace the placeholder values:

  • <share_name>: the name of the share.

  • <recipient_name>: the name of the recipient.

To show all grants on a share:

SHOW GRANT ON SHARE <share_name>;

To view all grants to a recipient:

SHOW GRANT TO RECIPIENT <recipient_name>;

To grant access:

GRANT SELECT
ON SHARE <share_name>
TO RECIPIENT <recipient_name>;

To revoke access:

REVOKE SELECT
ON SHARE <share_name>
FROM RECIPIENT <recipient_name>;

Note

SELECT is the only privilege you can grant on a share.

Security recommendations for recipients

When you enable Delta Sharing, you configure the token lifetime for recipient credentials. If you set the token lifetime to 0, recipient tokens never expire. Databricks recommends that you configure tokens to expire.

You should rotate a recipient’s credential in the situations listed in Rotate a recipient’s credential.

At any given time, a recipient can have at most two tokens: an active token and a rotated token. Until the rotated token expires, attempting to rotate the token again results in an error.

When you rotate a recipient’s credential, you can optionally set --existing-token-expire-in-seconds to the number of seconds before the existing recipient token expires. If you set the value to 0, the existing recipient token expires immediately.

Databricks recommends that you set --existing-token-expire-in-seconds to a relatively short period that gives the recipient organization time to access the new activation URL while minimizing the amount of time that the recipient has two active tokens. If you suspect that the recipient token is compromised, Databricks recommends that you force the existing recipient token to expire immediately.

If a recipient’s existing activation URL has never been accessed and the recipient has not been rotated, rotating the recipient invalidates the existing activation URL and replaces it with a new one.

If all recipient tokens have expired, rotating the recipient replaces the existing activation URL with a new one. Databricks recommends that you promptly rotate or drop a recipient whose token has expired.
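The two-token rule can be easier to reason about with a small model. The following sketch is an illustration only, not how Databricks implements rotation: it tracks token generations as integers, takes the current time as a plain number of seconds, and ignores the lifetime of the active token itself. All class and method names are hypothetical.

```python
class RotationError(Exception):
    """Raised when a rotation is attempted while two tokens are valid."""


class Recipient:
    def __init__(self, name):
        self.name = name
        self.generation = 1              # identifies the active token
        self.previous_expires_at = None  # expiry time of the rotated token

    def valid_generations(self, now):
        """Return the token generations that are usable at time `now`."""
        valid = [self.generation]
        if self.previous_expires_at is not None and now < self.previous_expires_at:
            valid.append(self.generation - 1)
        return valid

    def rotate(self, now, existing_token_expire_in_seconds=0):
        """Issue a new token; the old one stays valid for the given period."""
        if len(self.valid_generations(now)) == 2:
            raise RotationError(
                f"There are already two active tokens for recipient {self.name}."
            )
        self.generation += 1
        self.previous_expires_at = now + existing_token_expire_in_seconds


if __name__ == "__main__":
    recipient = Recipient("acme_corp")
    recipient.rotate(now=0, existing_token_expire_in_seconds=600)
    print(recipient.valid_generations(now=100))  # prints [2, 1]
    try:
        recipient.rotate(now=100)
    except RotationError as error:
        print(error)
    recipient.rotate(now=600)  # the rotated token has expired by now
    print(recipient.valid_generations(now=600))  # prints [3]
```

In this model, passing 0 leaves no overlap window, which mirrors the recommendation to expire the existing token immediately when a compromise is suspected.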

If a recipient activation link is inadvertently sent to the wrong person or is sent over an insecure channel, Databricks recommends that you:

  1. Revoke the recipient’s access to the share.

  2. Rotate the recipient and set --existing-token-expire-in-seconds to 0.

  3. Share the new activation link with the intended recipient over a secure channel.

  4. After the activation URL has been accessed, grant the recipient access to the share again.

In an extreme situation, Databricks recommends that you drop and re-create the recipient.

Audit access and activity for Delta Sharing resources

After you configure audit logging, Delta Sharing saves audit logs for activities such as when someone creates, modifies, updates, or deletes a share or a recipient, when a recipient accesses an activation link and downloads the credential, or when a recipient’s credential is rotated or expires. Delta Sharing activity is logged at the account level.

  1. Enable audit logs for your account.

    Important

    Delta Sharing activity is logged at the level of the account. Do not enter a value into workspace_ids_filter.

    Audit logs are delivered for each workspace in your account, as well as account-level activities. Logs are delivered to the S3 bucket you configure.

  2. Events for Delta Sharing are logged with serviceName set to unityCatalog. The requestParams section of each event includes a delta_sharing prefix.

    For example, the following audit event shows an update to the recipient token lifetime. In this example, redacted values are replaced with <redacted>.

    {
       "version":"2.0",
       "auditLevel":"ACCOUNT_LEVEL",
       "timestamp":1629775584891,
       "orgId":"3049059095686970",
       "shardName":"example-workspace",
       "accountId":"<redacted>",
       "sourceIPAddress":"<redacted>",
       "userAgent":"curl/7.64.1",
       "sessionId":"<redacted>",
       "userIdentity":{
          "email":"<redacted>",
          "subjectName":null
       },
       "serviceName":"unityCatalog",
       "actionName":"updateMetastore",
       "requestId":"<redacted>",
       "requestParams":{
           "id":"<redacted>",
           "delta_sharing_enabled":"true",
           "delta_sharing_recipient_token_lifetime_in_seconds": 31536000
        },
       "response":{
          "statusCode":200,
          "errorMessage":null,
          "result":null
       },
       "MAX_LOG_MESSAGE_LENGTH":16384
    }
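    As a rough illustration of working with such events, the following Python sketch picks out the delta_sharing-prefixed request parameters from a parsed event. The function name is hypothetical, the sample is a trimmed copy of the event above, and how the delivered log files are read is not covered here.

```python
import json


def delta_sharing_params(event):
    """Return requestParams entries whose key starts with 'delta_sharing'.

    Events from services other than unityCatalog yield an empty dict.
    """
    if event.get("serviceName") != "unityCatalog":
        return {}
    params = event.get("requestParams") or {}
    return {
        key: value
        for key, value in params.items()
        if key.startswith("delta_sharing")
    }


# Trimmed copy of the example audit event.
SAMPLE_EVENT = json.loads(
    """
    {
      "serviceName": "unityCatalog",
      "actionName": "updateMetastore",
      "requestParams": {
        "id": "<redacted>",
        "delta_sharing_enabled": "true",
        "delta_sharing_recipient_token_lifetime_in_seconds": 31536000
      }
    }
    """
)

if __name__ == "__main__":
    print(delta_sharing_params(SAMPLE_EVENT))
```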
    

The following table lists audited events for Delta Sharing, from the point of view of the data provider.

Note

The following important fields are always present in the audit log:

  • userIdentity.email: The ID of the user who initiated the activity.

  • requestParams.id: The ID of the Unity Catalog metastore.

Each actionName is listed with its requestParams:

  • updateMetastore

    • delta_sharing_enabled: If present, indicates that Delta Sharing was enabled.

    • delta_sharing_recipient_token_lifetime_in_seconds: If present, indicates that the recipient token lifetime was updated.

  • createRecipient

    • name: The name of the recipient.

    • comment: The comment for the recipient.

  • deleteRecipient

    • name: The name of the recipient.

  • getRecipient

    • name: The name of the recipient.

  • listRecipients

    • none

  • rotateRecipientToken

    • name: The name of the recipient.

    • comment: The comment given in the rotation command.

  • createShare

    • name: The name of the share.

    • comment: The comment for the share.

  • deleteShare

    • name: The name of the share.

  • getShare

    • name: The name of the share.

    • include_shared_objects: Whether the share’s table names were included in the request.

  • updateShare

    • name: The name of the share.

    • updates: A JSON representation of tables that were added or removed from the share. Each item includes action (add or remove), name (the actual name of the table), shared_as (the name the schema and table were shared as, if different from name), and partition_specification (if a partition specification was provided).

  • listShares

    • none

  • getSharePermissions

    • name: The name of the share.

  • updateSharePermissions

    • name: The name of the share.

    • changes: A JSON representation of the updated permissions. Each change includes principal (the user or group to whom permission is granted or revoked), add (the list of permissions that were granted), remove (the list of permissions that were revoked).

  • getRecipientSharePermissions

    • name: The name of the share.

  • getActivationUrlInfo

    • recipient_name: The name of the recipient who opened the activation URL.

  • retrieveRecipientToken

    • recipient_name: The name of the recipient who downloaded the token.

The following Delta Sharing errors are logged, from the point of view of the data provider. Items between < and > characters represent placeholder text.

  • Delta Sharing is not enabled on the selected metastore.

    DatabricksServiceException: FEATURE_DISABLED:
    Delta Sharing is not enabled
    
  • An operation was attempted on a catalog that does not exist.

    DatabricksServiceException: CATALOG_DOES_NOT_EXIST:
    Catalog 'xxx' does not exist.
    
  • A user who is not an account admin or metastore admin attempted to perform a privileged operation.

    DatabricksServiceException: PERMISSION_DENIED:
    Only administrators can <operation_name> <operation_target>
    
  • An operation was attempted on a metastore from a workspace to which the metastore is not assigned.

    DatabricksServiceException: INVALID_STATE:
    Workspace <workspace_name> is no longer assigned to this metastore
    
  • A request was missing the recipient name or share name.

    DatabricksServiceException: INVALID_PARAMETER_VALUE: CreateRecipient/CreateShare Missing required field: <recipient_name>/<share_name>
    
  • A request included an invalid recipient name or share name.

    DatabricksServiceException: INVALID_PARAMETER_VALUE: CreateRecipient/CreateShare <recipient_name>/<share_name> is not a valid name
    
  • A user attempted to share a table that is not in a Unity Catalog metastore.

    DatabricksServiceException: INVALID_PARAMETER_VALUE: Only managed or external table on Unity Catalog can be added to a share.

  • A user attempted to rotate a recipient that was already in a rotated state and whose previous token had not yet expired.

    DatabricksServiceException: INVALID_PARAMETER_VALUE: There are already two active tokens for recipient <recipient_name>.

  • A user attempted to create a new recipient or share with the same name as an existing one.

    DatabricksServiceException: RECIPIENT_ALREADY_EXISTS/SHARE_ALREADY_EXISTS: Recipient/Share <name> already exists

  • A user attempted to perform an operation on a recipient or share that does not exist.

    DatabricksServiceException: RECIPIENT_DOES_NOT_EXIST/SHARE_DOES_NOT_EXIST: Recipient/Share '<name>' does not exist.

  • A user attempted to add a table to a share, but the table had already been added.

    DatabricksServiceException: RESOURCE_ALREADY_EXISTS: Shared Table '<name>' already exists.

  • A user attempted to perform an operation that referenced a table that does not exist.

    DatabricksServiceException: TABLE_DOES_NOT_EXIST: Table '<name>' does not exist.

  • A user attempted to perform an operation that referenced a schema that did not exist.

    DatabricksServiceException: SCHEMA_DOES_NOT_EXIST: Schema '<name>' does not exist.

For auditable events and errors for data recipients, see Audit access and activity for Delta Sharing resources.
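The error messages listed in this section share the shape DatabricksServiceException: <ERROR_CODE>: <message>, which makes them straightforward to split when searching logs. The sketch below assumes that shape holds for the text being parsed; the pattern and function name are hypothetical and are not part of any Databricks library.

```python
import re

# Matches "DatabricksServiceException: <ERROR_CODE>: <message>".
# The message may start on the following line, as in some entries above.
ERROR_PATTERN = re.compile(
    r"^DatabricksServiceException:\s*(?P<code>[A-Z_]+):\s*(?P<message>.*)$",
    re.DOTALL,
)


def parse_error(text):
    """Return (error_code, message), or None if the text does not match."""
    match = ERROR_PATTERN.match(text.strip())
    if match is None:
        return None
    # Collapse line breaks and repeated spaces inside the message.
    message = " ".join(match.group("message").split())
    return match.group("code"), message


if __name__ == "__main__":
    print(
        parse_error(
            "DatabricksServiceException: TABLE_DOES_NOT_EXIST: "
            "Table 'orders' does not exist."
        )
    )
```

Grouping parsed errors by code gives a quick view of which failures occur most often, for example PERMISSION_DENIED from users who are not metastore admins.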

Limitations

  • Only tables stored in a Unity Catalog metastore can be shared with Delta Sharing.

  • Only managed and external tables in Delta format are supported.

  • Sharing views is not supported in this preview.

Next steps