Share data securely using Delta Sharing

This article introduces Delta Sharing in Databricks, the secure data sharing platform that lets you share data in Databricks with users outside your organization.

The Delta Sharing articles on this site focus on sharing Databricks data. Delta Sharing is also available as an open-source project that you can use to share Delta tables from other platforms.

Note

If you are a data recipient who has been granted access to shared data through Delta Sharing, and you just want to learn how to access that data, see Access data shared with you using Delta Sharing.

What is Delta Sharing?

Delta Sharing is an open protocol developed by Databricks for secure data sharing with other organizations regardless of the computing platforms they use. Databricks builds Delta Sharing into its Unity Catalog data governance platform, enabling a Databricks user, called a data provider, to share data with a person or group outside of their organization, called a data recipient.

Delta Sharing’s native integration with Unity Catalog allows you to manage, govern, audit, and track usage of the shared data on one platform. To be available for sharing, your data must be registered in Unity Catalog and stored in the Delta table format.

Figure: Delta Sharing workflow

Shares and recipients

The primary concepts underlying Delta Sharing in Databricks are shares and recipients.

Figure: Shares and recipients in Delta Sharing

What is a share?

In Delta Sharing, a share is a read-only collection of tables and table partitions to be shared with one or more recipients.

A share is a securable object registered in Unity Catalog. A share can contain tables from only one Unity Catalog metastore. You can add tables to or remove tables from a share at any time, and you can assign or revoke data recipient access to a share at any time.

If you remove a share from your Unity Catalog metastore, all recipients of that share lose the ability to access it.

See Create and manage shares for Delta Sharing.
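As a rough illustration of the share lifecycle described above, the following Python sketch builds the Databricks SQL statements that create a share and add or remove a table. The helper names, the share name `sales_share`, and the table name `main.sales.orders` are hypothetical, and the sketch assumes plain unquoted identifiers. It only produces statement text; running the statements requires a Unity Catalog-enabled workspace and the appropriate privileges.

```python
import re

# Assumption: only plain, unquoted identifiers are accepted, so the generated
# SQL needs no escaping. Names with special characters would need backticks.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _name(value: str, parts: int) -> str:
    """Return value unchanged if it has exactly `parts` dot-separated identifiers."""
    pieces = value.split(".")
    if len(pieces) != parts or not all(_IDENT.fullmatch(p) for p in pieces):
        raise ValueError(f"unsupported name: {value!r}")
    return value


def create_share_sql(share: str) -> str:
    """Statement that registers a new, empty share in the metastore."""
    return f"CREATE SHARE IF NOT EXISTS {_name(share, 1)}"


def add_table_sql(share: str, table: str) -> str:
    """Statement that adds a catalog.schema.table to the share."""
    return f"ALTER SHARE {_name(share, 1)} ADD TABLE {_name(table, 3)}"


def remove_table_sql(share: str, table: str) -> str:
    """Statement that removes a table from the share."""
    return f"ALTER SHARE {_name(share, 1)} REMOVE TABLE {_name(table, 3)}"


def drop_share_sql(share: str) -> str:
    """Statement that deletes the share; its recipients then lose access."""
    return f"DROP SHARE {_name(share, 1)}"


# Hypothetical share and table names, used only for illustration.
for statement in (
    create_share_sql("sales_share"),
    add_table_sql("sales_share", "main.sales.orders"),
):
    print(statement)
```

Requiring a three-part table name keeps the sketch aligned with the rule that shared tables must be registered in Unity Catalog.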

What is a recipient?

A recipient is an object that associates an organization with a credential or secure sharing identifier that allows that organization to access one or more shares.

As a data provider (sharer), you can define multiple recipients for any given Unity Catalog metastore, but if you want to share data from multiple metastores with a particular user or group of users, you must define the recipient separately for each metastore. A recipient can have access to multiple shares.

If you delete a recipient from your Unity Catalog metastore, that recipient loses access to all shares it could previously access.

See Create and manage data recipients for Delta Sharing.
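The per-metastore rule above can be sketched in Python as follows. The helper names, the recipient name `acme_corp`, and the metastore labels are hypothetical. The sketch assumes that omitting the sharing identifier yields a token-based recipient and that supplying one (through `USING ID`) yields a recipient tied to another Unity Catalog metastore. It only builds statement text, and each statement would have to be run in a workspace attached to the corresponding metastore.

```python
import re
from typing import Dict, Iterable, Optional

# Assumption: recipient names are plain, unquoted identifiers.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def create_recipient_sql(name: str, sharing_identifier: Optional[str] = None) -> str:
    """Build a CREATE RECIPIENT statement, optionally bound to a sharing identifier."""
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unsupported recipient name: {name!r}")
    statement = f"CREATE RECIPIENT IF NOT EXISTS {name}"
    if sharing_identifier is not None:
        if "'" in sharing_identifier:
            raise ValueError("sharing identifier must not contain quotes")
        statement += f" USING ID '{sharing_identifier}'"
    return statement


def drop_recipient_sql(name: str) -> str:
    """Build the statement that deletes a recipient and so ends its access."""
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unsupported recipient name: {name!r}")
    return f"DROP RECIPIENT {name}"


def recipient_per_metastore(name: str, metastores: Iterable[str]) -> Dict[str, str]:
    """Map each metastore label to the statement that must be run against it."""
    return {metastore: create_recipient_sql(name) for metastore in metastores}


# Hypothetical metastore labels: the same recipient is defined once per metastore.
plan = recipient_per_metastore("acme_corp", ["metastore_us", "metastore_eu"])
for metastore, statement in plan.items():
    print(f"{metastore}: {statement}")
```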

Open sharing versus Databricks-to-Databricks sharing

The way you use Delta Sharing depends on who you are sharing data with:

  • Open sharing lets you share data with any user, whether or not they have access to Databricks.

  • Databricks-to-Databricks sharing lets you share data with Databricks users who have access to a Unity Catalog metastore that is different from yours.

What is open Delta Sharing?

If you want to share data with users outside of your Databricks workspace, regardless of whether they use Databricks, you can use open Delta Sharing to share your data securely. As a data provider, you generate a token and share it securely with the recipient. They use the token to authenticate and get read access to the tables you’ve included in the shares you’ve given them access to.

Recipients can access the shared data using many computing tools and platforms, including:

  • Databricks

  • Apache Spark

  • Pandas

  • Power BI

For a full list of Delta Sharing connectors and information about how to use them, see the Delta Sharing documentation.

See also Share data using the Delta Sharing open sharing protocol.
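To make the token-based flow more concrete, here is a minimal Python sketch of what a recipient typically works with in open sharing: a small JSON credential file (often called a profile file) and a table address of the form `<profile-path>#<share>.<schema>.<table>`. The field names follow the open-source Delta Sharing profile format as commonly documented. The endpoint, token placeholder, file name, and share, schema, and table names are illustrative assumptions, not values from this article.

```python
import json

# Fields assumed to be present in a credential (profile) file.
REQUIRED_KEYS = ("shareCredentialsVersion", "endpoint", "bearerToken")

# Placeholder profile contents; a real file is downloaded by the recipient.
SAMPLE_PROFILE = """{
  "shareCredentialsVersion": 1,
  "endpoint": "https://sharing.example.com/delta-sharing/",
  "bearerToken": "<token>"
}"""


def parse_profile(text: str) -> dict:
    """Parse profile JSON and check that the expected fields exist."""
    profile = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in profile]
    if missing:
        raise ValueError(f"profile is missing fields: {missing}")
    return profile


def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the address that Delta Sharing connectors use to locate a shared table."""
    return f"{profile_path}#{share}.{schema}.{table}"


profile = parse_profile(SAMPLE_PROFILE)
url = table_url("config.share", "sales_share", "sales", "orders")
print(profile["endpoint"])
print(url)

# With the open-source Python connector installed (pip install delta-sharing),
# a recipient could then read the table into pandas:
#   import delta_sharing
#   df = delta_sharing.load_as_pandas(url)
```

The connector call is left as a comment because it needs a real credential file and network access to the provider's sharing server.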

What is Databricks-to-Databricks Delta Sharing?

If you want to share data with users who don’t have access to your Unity Catalog metastore, you can use Databricks-to-Databricks Delta Sharing, as long as the recipients have access to a Databricks workspace that is enabled for Unity Catalog. Databricks-to-Databricks sharing lets you share data with users in other Databricks accounts, whether they’re on AWS or Azure. You can also use it to securely share data across different Unity Catalog metastores in your own Databricks account.

The advantage of this scenario is that the share recipient doesn’t need a token to access the share, and the provider doesn’t need to manage recipient tokens. The security of the sharing connection, including all identity verification, authentication, and auditing, is managed entirely through Delta Sharing and the Databricks platform.

See also Share data using the Delta Sharing Databricks-to-Databricks protocol.
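The token-free flow relies on a sharing identifier that the recipient gives to the provider. The Python sketch below assumes the identifier has the form `<cloud>:<region>:<uuid>` and that a recipient mounts a share as a catalog with `CREATE CATALOG ... USING SHARE`. Treat both as assumptions to verify against the linked articles. The identifier value and the catalog, provider, and share names are placeholders.

```python
import re
from typing import NamedTuple

# Assumption: catalog, provider, and share names are plain identifiers.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


class SharingIdentifier(NamedTuple):
    cloud: str
    region: str
    metastore_id: str


def parse_sharing_identifier(value: str) -> SharingIdentifier:
    """Split an identifier assumed to look like <cloud>:<region>:<uuid>."""
    parts = value.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unexpected sharing identifier: {value!r}")
    return SharingIdentifier(*parts)


def create_catalog_from_share_sql(catalog: str, provider: str, share: str) -> str:
    """Recipient-side statement that exposes a share as a read-only catalog."""
    for name in (catalog, provider, share):
        if not _IDENT.fullmatch(name):
            raise ValueError(f"unsupported name: {name!r}")
    return f"CREATE CATALOG IF NOT EXISTS {catalog} USING SHARE {provider}.{share}"


# Placeholder identifier, not a real metastore.
identifier = parse_sharing_identifier(
    "aws:us-west-2:00000000-0000-0000-0000-000000000000"
)
print(identifier.cloud, identifier.region)
print(create_catalog_from_share_sql("shared_sales", "acme_provider", "sales_share"))
```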

How do admins set up Delta Sharing?

Databricks-to-Databricks sharing between Unity Catalog metastores in the same account is always enabled. To enable Delta Sharing to share data with Databricks workspaces in other accounts or non-Databricks clients, a Databricks account admin or metastore admin performs the following setup steps (at a high level):

  1. Enable the External Data Sharing feature group for your Databricks account.

    See Enable Delta Sharing for your account.

  2. Enable Delta Sharing for the Unity Catalog metastore that manages the data you want to share.

    Note

    You do not need to enable Delta Sharing on your metastore if you intend to use Delta Sharing to share data only with users on other Unity Catalog metastores in your account. Metastore-to-metastore sharing within a single Databricks account is enabled by default.

    See Enable Delta Sharing on a metastore.

  3. Create a share that includes one or more tables in the metastore.

    See Create and manage shares for Delta Sharing.

  4. Create a recipient.

    See Create and manage data recipients for Delta Sharing.

    If your recipient is not a Databricks user, or does not have access to a Databricks workspace that is enabled for Unity Catalog, you must use open sharing. A set of token-based credentials is generated for that recipient.

    If your recipient has access to a Databricks workspace that is enabled for Unity Catalog, you can use Databricks-to-Databricks sharing, and no token-based credentials are required. You request a sharing identifier from the recipient and use it to establish the secure connection.

    Tip

    Use yourself as a test recipient to try out the setup process.

  5. Grant the recipient access to one or more shares.

    See Grant and manage access to Delta Sharing data shares.

  6. Send the recipient the information they need to connect to the share.

    See Send the recipient their connection information.

    For open sharing, use a secure channel to send the recipient an activation link that allows them to download their token-based credentials.

    For Databricks-to-Databricks sharing, the data included in the share becomes available in the recipient’s Databricks workspace as soon as you grant them access to the share.

The recipient can now access the shared data.
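Steps 4 through 6 hinge on one decision (which sharing model applies) and one grant. The Python sketch below captures that logic under stated assumptions: the function names are hypothetical, the share and recipient names are placeholders, and the `GRANT` and `REVOKE` statements are only built as text, not executed.

```python
import re

# Assumption: share and recipient names are plain, unquoted identifiers.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def choose_sharing_model(has_unity_catalog_workspace: bool) -> str:
    """Pick the model: token-free sharing needs a Unity Catalog-enabled workspace."""
    return "databricks-to-databricks" if has_unity_catalog_workspace else "open"


def connection_info(model: str) -> str:
    """Describe what the provider sends to the recipient for each model."""
    if model == "open":
        return "activation link, sent over a secure channel"
    if model == "databricks-to-databricks":
        return "nothing to send; data appears once access is granted"
    raise ValueError(f"unknown sharing model: {model!r}")


def _checked(name: str) -> str:
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unsupported name: {name!r}")
    return name


def grant_sql(share: str, recipient: str) -> str:
    """Statement that gives a recipient read access to a share."""
    return f"GRANT SELECT ON SHARE {_checked(share)} TO RECIPIENT {_checked(recipient)}"


def revoke_sql(share: str, recipient: str) -> str:
    """Statement that withdraws a recipient's access to a share."""
    return f"REVOKE SELECT ON SHARE {_checked(share)} FROM RECIPIENT {_checked(recipient)}"


# Placeholder names for a recipient without a Unity Catalog-enabled workspace.
model = choose_sharing_model(has_unity_catalog_workspace=False)
print(model)
print(connection_info(model))
print(grant_sql("sales_share", "acme_corp"))
```

Keeping the model decision separate from the grant reflects the steps above: the grant is the same in both models, and only the credential exchange differs.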

How do recipients access the shared data?

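Recipients access the shared data in read-only format. Secure access depends on the sharing model:

  • Open sharing: the recipient authenticates with the token-based credentials that they downloaded using the activation link from the provider.

  • Databricks-to-Databricks sharing: the recipient accesses the data from their own Databricks workspace that is enabled for Unity Catalog, and no token is required.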

Whenever the data provider updates data tables in their own Databricks account, the updates appear in near real time in the recipient’s system.

How do you keep track of who is sharing and accessing shared data?

Data providers can use Databricks audit logging to monitor the creation and modification of shares and recipients, and can monitor recipient activity on shares. See Audit and monitor data sharing using Delta Sharing (for providers).

Data recipients who use shared data in a Databricks account can use Databricks audit logging to understand who is accessing which data. See Audit and monitor data access using Delta Sharing (for recipients).

Limitations

  • Only tables stored in a Unity Catalog metastore can be shared using Delta Sharing.

  • Only tables in Delta format are supported. You can convert Parquet tables to Delta and back again. See CONVERT TO DELTA.

  • Sharing views is not supported in this release.