Read data shared using Databricks-to-Databricks Delta Sharing

This article describes how to read data that has been shared with you using the Databricks-to-Databricks Delta Sharing protocol, in which Databricks manages a secure connection for data sharing. Unlike the Delta Sharing open sharing protocol, the Databricks-to-Databricks protocol does not require a credential file (token-based security).

Note

If data has been shared with you using the Delta Sharing open sharing protocol, see Read data shared using Databricks-to-Databricks Delta Sharing.

How do I make shared data available to my team?

To read shared data that has been shared with you using the Databricks-to-Databricks protocol, you must be a user on a Databricks workspace that is enabled for Unity Catalog. A member of your team provides the data provider with a unique identifier for your Databricks workspace, and the data provider uses that identifier to create a secure sharing connection with your organization. The shared data then becomes available for read access in your workspace, and any updates that the data provider makes to the shared tables and partitions are updated in your workspace in near real time.

Note

Updates to shared data tables appear in your workspace in near real time. However, column changes (adding, renaming, deleting) may not display in Data Explorer for one minute. Likewise, new shares and updates to shares (such as adding new tables to a share) are cached for one minute before they are available for you to view and query.

To read data that has been shared with you:

  1. You or another user on your team finds the share—the container for the tables that have been shared with you—and use that share to create a catalog—the top-level container for all data in Databricks Unity Catalog.

  2. You or another user on your team grants or denies access to the catalog and the objects inside the catalog (schemas and tables) to other members of your team.

  3. You read the data in the tables that you have been granted access to just like any other table in Databricks that you have read-only (SELECT) access to.

View providers and shares

To start reading the data that has been shared with you by a data provider, you need to know the name of the provider and share objects that are stored in your Unity Catalog metastore once the provider has shared data with you.

The provider object represents the Unity Catalog metastore, cloud platform, and region of the organization that shared the data with you.

The share object represents the tables that the provider has shared with you.

View all providers who have shared data with you

To view a list of available data providers, you can use Data Explorer, the Databricks Unity Catalog CLI, or the SHOW PROVIDERS SQL command in a Databricks notebook or the Databricks SQL query editor.

For details, see View providers.

View provider details

To view details about a provider, you can use Data Explorer, the Databricks Unity Catalog CLI, or the DESCRIBE PROVIDER SQL command in a Databricks notebook or the Databricks SQL query editor.

For details, see View provider details.

View shares

To view the shares that a provider has shared with you, you can use Data Explorer, the Databricks Unity Catalog CLI, or the SHOW SHARES IN PROVIDER SQL command in a Databricks notebook or the Databricks SQL query editor.

For details, see View shares that a provider has shared with you.

Access data in a shared table

To read data in a shared table:

  1. A privileged user must create a catalog from the share that contains the table. This can be a metastore admin or a user who has both the CREATE_CATALOG privilege for your Unity Catalog metastore and ownership of the provider object.

  2. That user or a user with the same privileges must grant you access to the shared table.

  3. You can access the table just as you would any other table registered in your Unity Catalog metastore.

Create a catalog from a share

To access the data in a share, you must create a catalog from the share. To create a catalog from a share, you can use Data Explorer, the Databricks Unity Catalog CLI, or SQL commands in a Databricks notebook or the Databricks SQL query editor.

Permissions required: Metastore admin or a user who has both the CREATE_CATALOG privilege for your Unity Catalog metastore and ownership of the provider object.

  1. In your Databricks workspace, click Data Icon Data.

  2. In the left pane, expand the Delta Sharing menu and select Shared with me.

  3. On the Providers tab, select the provider.

  4. On the Shares tab, find the share and click Create catalog on the share row.

  5. Enter a name for the catalog and optional comment.

  6. Click Create.

Run the following command in a notebook or the Databricks SQL query editor.

CREATE CATALOG [IF NOT EXISTS] <catalog-name>
USING SHARE <provider_name>.<share_name>;
databricks unity-catalog catalogs create --name <catalog_name> /
                                    --provider <provider_name> /
                                    --share <share_name>

The catalog created from a share has a catalog type of Delta Sharing. You can view the type on the catalog details page in Data Explorer or by running the DESCRIBE CATALOG SQL command in a notebook or Databricks SQL query.

A Delta Sharing catalog can be managed in the same way as regular catalogs on a Unity Catalog metastore. You can view, update, and delete a Delta Sharing catalog using Data Explorer, the Databricks CLI, and by using SHOW CATALOGS, DESCRIBE CATALOG, ALTER CATALOG, and DROP CATALOG SQL commands.

The 3-level namespace structure under a Delta Sharing catalog created from a share is the same as the one under a regular catalog on Unity Catalog: catalog.schema.table.

Table data under a shared catalog is read-only, which means you can perform read operations like DESCRIBE, SHOW, and SELECT.

Manage permissions for the schemas and tables in a Delta Sharing catalog

By default, the catalog creator is the owner of all data objects under a Delta Sharing catalog and can manage permissions for any of them.

Privileges are inherited downward, although some workspaces may still be on the legacy security model that did not provide inheritance. See Inheritance model. Any user granted the SELECT privilege on the catalog will have the SELECT privilege on all of the schemas and tables in the catalog unless that privilege is revoked. You cannot grant privileges that give write or update access to a Delta Sharing catalog or objects in a Delta Sharing catalog.

The catalog owner can delegate the ownership of data objects to other users or groups, thereby granting those users the ability to manage the object permissions and life cycles.

For detailed information about managing privileges on data objects using Unity Catalog, see Manage privileges in Unity Catalog.

Read data in a shared table

You can read data in a shared table using any of the tools available to you as a Databricks user: Data Explorer, notebooks, SQL queries, the Databricks CLI, and Databricks REST APIs.

Query a table’s change data feed

If a change data feed (CDF) is shared with a table, you can query the table data as of a version. Querying as of a timestamp is not supported.

For example:

SELECT * FROM vaccine.vaccine_us.vaccine_us_distribution VERSION AS OF 3;

To query the CDF of a shared table, both version and timestamp are supported:

SELECT * FROM table_changes('vaccine.vaccine_us.vaccine_us_distribution', 0, 3);
SELECT * FROM table_changes('vaccine.vaccine_us.vaccine_us_distribution', "2022-01-01 00:00:00", "2022-02-01 00:00:00");

For more information about change data feed, see Use Delta Lake change data feed on Databricks.