This article describes how to read data that has been shared with you using the Databricks-to-Databricks Delta Sharing protocol, in which Databricks manages a secure connection for data sharing. Unlike the Delta Sharing open sharing protocol, the Databricks-to-Databricks protocol does not require a credential file (token-based security).
If data has been shared with you using the Delta Sharing open sharing protocol, see Read data shared using Delta Sharing open sharing.
To read data that has been shared with you using the Databricks-to-Databricks protocol, you must be a user on a Databricks workspace that is enabled for Unity Catalog. A member of your team provides the data provider with a unique identifier for your Databricks workspace, and the data provider uses that identifier to create a secure sharing connection with your organization. The shared data then becomes available for read access in your workspace, and any updates that the data provider makes to the shared tables and partitions are reflected in your workspace in near real time.
Although updates to shared tables appear in your workspace in near real time, column changes (adding, renaming, deleting) may take up to one minute to display in Data Explorer. Likewise, new shares and updates to shares (such as adding new tables to a share) are cached for one minute before they are available for you to view and query.
To read data that has been shared with you:
You or another user on your team finds the share (the container for the tables that have been shared with you) and uses that share to create a catalog (the top-level container for all data in Databricks Unity Catalog).
You or another user on your team grants or denies access to the catalog and the objects inside the catalog (schemas and tables) to other members of your team.
You read the data in the tables that you have been granted access to just like any other table in Databricks that you have read-only (SELECT) access to.
To start reading the data that a data provider has shared with you, you need to know the names of the provider and share objects that are stored in your Unity Catalog metastore once the provider has shared data with you.
The provider object represents the Unity Catalog metastore, cloud platform, and region of the organization that shared the data with you.
The share object represents the tables that the provider has shared with you.
To view a list of available data providers, you can use Data Explorer, the Databricks Unity Catalog CLI, or the SHOW PROVIDERS SQL command in a Databricks notebook or the Databricks SQL query editor.
For details, see View providers.
To view details about a provider, you can use Data Explorer, the Databricks Unity Catalog CLI, or the DESCRIBE PROVIDER SQL command in a Databricks notebook or the Databricks SQL query editor.
For details, see View provider details.
To view the shares that a provider has shared with you, you can use Data Explorer, the Databricks Unity Catalog CLI, or the SHOW SHARES IN PROVIDER SQL command in a Databricks notebook or the Databricks SQL query editor.
For details, see View shares that a provider has shared with you.
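As a rough illustration, the three lookups described above could be run in sequence from a notebook or the SQL query editor. The provider name acme_health is a hypothetical placeholder, not a name from this article; substitute a provider name that SHOW PROVIDERS returns in your own metastore.

```sql
-- List the providers that have shared data with your metastore.
SHOW PROVIDERS;

-- Inspect one provider. acme_health is a hypothetical provider name.
DESCRIBE PROVIDER acme_health;

-- List the shares that this provider has made available to you.
SHOW SHARES IN PROVIDER acme_health;
```

The share names returned by the last statement are the ones you would reference when creating a catalog from a share.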
To read data in a shared table:
A privileged user must create a catalog from the share that contains the table. This can be a metastore admin or a user who has both the CREATE_CATALOG privilege for your Unity Catalog metastore and ownership of the provider object.
That user or a user with the same privileges must grant you access to the shared table.
You can access the table just as you would any other table registered in your Unity Catalog metastore.
To access the data in a share, you must create a catalog from the share. To create a catalog from a share, you can use Data Explorer, the Databricks Unity Catalog CLI, or SQL commands in a Databricks notebook or the Databricks SQL query editor.
Permissions required: Metastore admin or a user who has both the CREATE_CATALOG privilege for your Unity Catalog metastore and ownership of the provider object.
In your Databricks workspace, click Data.
In the left pane, expand the Delta Sharing menu and select Shared with me.
On the Providers tab, select the provider.
On the Shares tab, find the share and click Create catalog on the share row.
Enter a name for the catalog and optional comment.
Run the following command in a notebook or the Databricks SQL query editor.
CREATE CATALOG [IF NOT EXISTS] <catalog_name> USING SHARE <provider_name>.<share_name>;
Run the following command using the Databricks Unity Catalog CLI.
databricks unity-catalog catalogs create --name <catalog_name> --provider <provider_name> --share <share_name>
The catalog created from a share has a catalog type of Delta Sharing. You can view the type on the catalog details page in Data Explorer or by running the DESCRIBE CATALOG SQL command in a notebook or the Databricks SQL query editor.
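A minimal sketch of the SQL path, assuming a hypothetical provider named acme_health that has shared a hypothetical share named vaccine_share. The catalog name vaccine is chosen to match the query examples later in this article.

```sql
-- Create a catalog from the share (provider and share names are hypothetical).
CREATE CATALOG IF NOT EXISTS vaccine
  USING SHARE acme_health.vaccine_share;

-- Inspect the new catalog; per the text above, the output should
-- identify it as a Delta Sharing catalog.
DESCRIBE CATALOG vaccine;
```

Using IF NOT EXISTS makes the statement safe to rerun if another user on your team has already created the catalog.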
A Delta Sharing catalog can be managed in the same way as regular catalogs on a Unity Catalog metastore. You can view, update, and delete a Delta Sharing catalog using Data Explorer, the Databricks CLI, or the DESCRIBE CATALOG, ALTER CATALOG, and DROP CATALOG SQL commands.
The three-level namespace structure under a Delta Sharing catalog created from a share is the same as the one under a regular catalog on Unity Catalog: catalog.schema.table.
Table data under a shared catalog is read-only, which means you can perform read operations such as SELECT but cannot write to or update the tables.
By default, the catalog creator is the owner of all data objects under a Delta Sharing catalog and can manage permissions for any of them.
Privileges are inherited downward, although some workspaces may still be on the legacy security model that did not provide inheritance. See Inheritance model. Any user granted the SELECT privilege on the catalog will have the SELECT privilege on all of the schemas and tables in the catalog unless that privilege is revoked. You cannot grant privileges that give write or update access to a Delta Sharing catalog or objects in a Delta Sharing catalog.
The catalog owner can delegate the ownership of data objects to other users or groups, thereby granting those users the ability to manage the object permissions and life cycles.
For detailed information about managing privileges on data objects using Unity Catalog, see Manage privileges in Unity Catalog.
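The following is a sketch of how a catalog owner might grant read access, assuming the workspace uses the privilege inheritance model described above. The group names data-analysts and data-platform-admins are hypothetical, and workspaces on the legacy security model may need different grants.

```sql
-- Allow the group to use the catalog and the schemas inside it.
GRANT USE CATALOG ON CATALOG vaccine TO `data-analysts`;
GRANT USE SCHEMA ON CATALOG vaccine TO `data-analysts`;

-- Grant read access; with inheritance this covers every schema
-- and table in the catalog.
GRANT SELECT ON CATALOG vaccine TO `data-analysts`;

-- Review the privileges currently granted on the catalog.
SHOW GRANTS ON CATALOG vaccine;

-- Optionally delegate ownership of the catalog to another group.
ALTER CATALOG vaccine OWNER TO `data-platform-admins`;
```

Granting at the catalog level keeps the setup short. To limit access to specific tables, the same GRANT SELECT statement can target a schema or a single table instead.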
You can read data in a shared table using any of the tools available to you as a Databricks user: Data Explorer, notebooks, SQL queries, the Databricks CLI, and Databricks REST APIs.
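For example, from a notebook or the SQL query editor you could walk down the three-level namespace of a shared catalog and then query a table. This sketch reuses the vaccine.vaccine_us.vaccine_us_distribution table name from the examples in this article and assumes you have been granted SELECT on it.

```sql
-- List the schemas in the shared catalog.
SHOW SCHEMAS IN vaccine;

-- List the tables in one schema.
SHOW TABLES IN vaccine.vaccine_us;

-- Inspect the columns of a shared table.
DESCRIBE TABLE vaccine.vaccine_us.vaccine_us_distribution;

-- Read a sample of rows. Shared tables are read-only.
SELECT * FROM vaccine.vaccine_us.vaccine_us_distribution LIMIT 10;
```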
If a change data feed (CDF) is shared with a table, you can query the table data as of a version. Querying as of a timestamp is not supported.
SELECT * FROM vaccine.vaccine_us.vaccine_us_distribution VERSION AS OF 3;
To query the CDF of a shared table, both version and timestamp are supported:
SELECT * FROM table_changes('vaccine.vaccine_us.vaccine_us_distribution', 0, 3);
SELECT * FROM table_changes('vaccine.vaccine_us.vaccine_us_distribution', '2022-01-01 00:00:00', '2022-02-01 00:00:00');
For more information about change data feed, see Use Delta Lake change data feed on Databricks.
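As an illustration of working with the change data feed, the sketch below summarizes how many rows were inserted, updated, or deleted in each version. It assumes the provider has shared the table with its change data feed and relies on the standard Delta Lake change data feed metadata columns _change_type and _commit_version.

```sql
-- Count changed rows per commit version and change type
-- for versions 0 through 3 of the shared table.
SELECT
  _commit_version,
  _change_type,
  count(*) AS row_count
FROM table_changes('vaccine.vaccine_us.vaccine_us_distribution', 0, 3)
GROUP BY _commit_version, _change_type
ORDER BY _commit_version, _change_type;
```

A summary like this can help you decide whether to reprocess a full table or apply only the changed rows downstream.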