Access data shared with you using Delta Sharing
This article explains how to access data that has been shared with you using Delta Sharing.
Delta Sharing is an open standard for secure data sharing. A Databricks user, referred to as a data provider in this context, can use Delta Sharing on Databricks to share data with a person or group outside of their organization, called a data recipient. Your SAP Databricks workspace can be a data recipient but cannot be a data provider for any entities outside of your SAP BDC account.
When data is shared with you, it becomes available for read access in your workspace, and any updates that the data provider makes to the shared tables, views, volumes, and partitions are reflected in your workspace in near real time.
Column changes (adding, renaming, or deleting columns), however, may take up to one minute to appear in Catalog Explorer. Likewise, new shares and updates to shares (such as adding new tables to a share) are cached for one minute before they are available for you to view and query.
Permissions required
To list and view details about all providers and provider shares, you must be a metastore admin or have the USE PROVIDER privilege. Other users have access only to the providers and shares that they own.
To create a catalog from a provider share, you must be a metastore admin or a user who has both the CREATE CATALOG and USE PROVIDER privileges for your Unity Catalog metastore.
The ability to grant read-only access to the schemas (databases), tables, views, and volumes in the catalog created from the share follows the typical Unity Catalog privilege hierarchy. The ability to view notebooks in the catalog created from the share requires the USE CATALOG privilege on the catalog.
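As an illustration of the privilege hierarchy described above, the following minimal Python sketch builds the GRANT statements that would typically give a group read-only access to a catalog created from a share. The catalog name `vendor_sales` and the group `analysts` are hypothetical, and the helper functions are not part of any Databricks library. The sketch only produces SQL text; in a notebook you could pass each statement to `spark.sql`, assuming you hold the privileges needed to grant access.

```python
def quote_identifier(name: str) -> str:
    """Wrap a name in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def read_only_grants(catalog: str, principal: str,
                     include_volumes: bool = False) -> list[str]:
    """Build GRANT statements for read-only access to a shared catalog.

    Privileges granted on the catalog are inherited by the schemas,
    tables, and views inside it.
    """
    privileges = ["USE CATALOG", "USE SCHEMA", "SELECT"]
    if include_volumes:
        # Reading files in shared volumes needs a separate privilege.
        privileges.append("READ VOLUME")
    target = quote_identifier(catalog)
    grantee = quote_identifier(principal)
    return [f"GRANT {p} ON CATALOG {target} TO {grantee}" for p in privileges]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for statement in read_only_grants("vendor_sales", "analysts"):
        print(statement)
```

Granting at the catalog level keeps the example short. If only part of the shared data should be visible to a group, the same statements can target an individual schema or table instead.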
Databricks-to-Databricks sharing and open sharing
How you access the data depends on whether your data provider configured the data being shared with you for Databricks-to-Databricks sharing or open sharing.
In the Databricks-to-Databricks model, a member of your team provides the data provider with a unique identifier for your Unity Catalog metastore, and the data provider uses that to create a secure sharing connection. The shared data becomes available for access in your workspace. If necessary, a member of your team configures granular access control on that data.
In the open sharing model, you can use any tool you like (including SAP Databricks) to access the shared data. The data provider sends you an activation URL or a portal link over a secure channel. You follow it to download a credential file or URL that lets you access the data shared with you.
The shared data is not provided by Databricks directly but by data providers running on Databricks.
Databricks may collect information about data recipients' use of and access to the shared data (including identifying any individual or company who accesses the data using the credential file in connection with such information) and may share it with the applicable data provider.
Get access in the Databricks-to-Databricks model
In the Databricks-to-Databricks model:
- Send the data provider the sharing identifier for the Unity Catalog metastore associated with your Databricks workspace.
  The sharing identifier is a string consisting of the metastore's cloud, region, and UUID (the unique identifier for the metastore), in the format <cloud>:<region>:<uuid>. For example, aws:eu-west-1:b0c978c8-3e68-4cdf-94af-d05c120ed1ef.
  To get the sharing identifier using Catalog Explorer:
  - In your SAP Databricks workspace, click Catalog.
  - At the top of the Catalog pane, click the gear icon and select Delta Sharing. Alternatively, from the Quick access page, click the Delta Sharing > button.
  - On the Shared with me tab, click your Databricks sharing organization name in the upper right, and select Copy sharing identifier.
  To get the sharing identifier using a notebook or Databricks SQL query, use the default SQL function CURRENT_METASTORE. If you use a notebook, it must run in the workspace you will use to access the shared data.
    SELECT CURRENT_METASTORE();
- The data provider creates:
  - A recipient in their Databricks account to represent you and the users in your organization who will access the data.
  - A share, which is a representation of the tables, volumes, and views to be shared with you.
- You access the data shared with you. You or someone on your team can, if necessary, configure granular data access on that data for your users.
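Before you send a sharing identifier to a data provider, it can help to confirm that the value you copied has the <cloud>:<region>:<uuid> shape described in the steps above. The following Python sketch is one way to do that. The `SharingIdentifier` type and `parse_sharing_identifier` function are hypothetical helpers written for this example, not part of a Databricks API, and they check only the format, not whether the metastore exists.

```python
import uuid
from typing import NamedTuple


class SharingIdentifier(NamedTuple):
    cloud: str
    region: str
    metastore_id: str


def parse_sharing_identifier(value: str) -> SharingIdentifier:
    """Split a <cloud>:<region>:<uuid> string and validate the UUID part."""
    parts = value.strip().split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"expected <cloud>:<region>:<uuid>, got {value!r}"
        )
    cloud, region, raw_uuid = parts
    try:
        # uuid.UUID raises ValueError for malformed input.
        metastore_id = str(uuid.UUID(raw_uuid))
    except ValueError as exc:
        raise ValueError(f"invalid metastore UUID: {raw_uuid!r}") from exc
    return SharingIdentifier(cloud, region, metastore_id)


if __name__ == "__main__":
    # The example identifier from this article.
    parsed = parse_sharing_identifier(
        "aws:eu-west-1:b0c978c8-3e68-4cdf-94af-d05c120ed1ef"
    )
    print(parsed.cloud, parsed.region, parsed.metastore_id)
```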
Get access in the open sharing model
In the open sharing model:
- The data provider creates:
  - A recipient in their Databricks account to represent you and the users in your organization who will access the data.
  - A share, which is a representation of the tables and partitions to be shared with you.
- The data provider sends you either an activation URL (over a secure channel) or a portal URL. You follow it to download a credential file or a URL that lets you access the data shared with you.
  Both bearer tokens and OAuth Client Credentials are supported.
  Important: Don't share the activation link with anyone. You can download a credential file only once. If you visit the activation link again after the credential file has already been downloaded, the Download Credential File button is disabled.
  If you lose the activation link before you use it, contact the data provider.
- Store the credential file in a secure location.
  Don't share the credential file with anyone outside the group of users who should have access to the shared data. If you need to share it with someone in your organization, Databricks recommends using a password manager.
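One possible way to act on the storage advice above is sketched below in Python. It restricts a downloaded credential file to its owner and reads back only non-secret fields. It assumes the file is a JSON document that follows the open Delta Sharing profile format, with `shareCredentialsVersion` and `endpoint` keys and an optional `expirationTime`. The file name `config.share` and the function `secure_credential_file` are hypothetical, and the permission change is meaningful only on POSIX systems.

```python
import json
import os
import stat
from pathlib import Path

# Keys assumed to be present in a Delta Sharing profile (credential) file.
REQUIRED_KEYS = ("shareCredentialsVersion", "endpoint")


def secure_credential_file(path: str) -> dict:
    """Validate a credential file and make it readable by its owner only.

    Returns a summary without secrets, so it is safe to log.
    """
    file_path = Path(path)
    profile = json.loads(file_path.read_text(encoding="utf-8"))
    if not isinstance(profile, dict):
        raise ValueError("credential file must contain a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in profile]
    if missing:
        raise ValueError(f"credential file is missing keys: {missing}")
    # Owner read/write only (mode 0600); no access for group or others.
    os.chmod(file_path, stat.S_IRUSR | stat.S_IWUSR)
    return {
        "endpoint": profile["endpoint"],
        "expirationTime": profile.get("expirationTime"),
    }


if __name__ == "__main__":
    # Hypothetical location of the downloaded credential file.
    candidate = Path("config.share")
    if candidate.exists():
        print(secure_credential_file(str(candidate)))
```

The function deliberately never returns or prints the token or client secret. File permissions complement, but do not replace, the password manager recommendation for sharing the file within your organization.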
Create a catalog from a share
To make the data in a share accessible to your team, you must create a catalog from the share. After the share has been added to a catalog, you can grant or deny access to the catalog and the objects inside the catalog (schemas, tables, views, and volumes) to other members of your team.
Permissions required: A metastore admin, a user who has both the CREATE CATALOG and USE PROVIDER privileges for your Unity Catalog metastore, or a user who has the CREATE CATALOG privilege.
If the share includes views, you must use a catalog name that is different from the name of the catalog that contains the view in the provider's metastore.
- In your SAP Databricks workspace, click Catalog to open Catalog Explorer.
- At the top of the Catalog pane, click the gear icon and select Delta Sharing. Alternatively, from the Quick access page, click the Delta Sharing > button.
- On the Shared with me tab, find and select the provider.
- On the Shares tab, find the share and click Create catalog on the share row.
- Enter a name for the catalog and an optional comment.
- Click Create.
Alternatively, when you open Catalog Explorer, you can click Create Catalog in the upper right to create a shared catalog.
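If you prefer SQL to the Catalog Explorer steps, a catalog can also be created from a share with a CREATE CATALOG ... USING SHARE statement. The Python sketch below only builds that statement as text and applies the naming rule for shares that include views. The names `vendor_sales`, `acme`, and `sales_share`, the `provider_view_catalogs` parameter, and the helper functions are all hypothetical. In a notebook you could pass the result to `spark.sql`, assuming you hold the permissions listed in this section.

```python
from typing import Iterable


def quote_identifier(name: str) -> str:
    """Wrap a name in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def create_catalog_statement(
    catalog: str,
    provider: str,
    share: str,
    provider_view_catalogs: Iterable[str] = (),
) -> str:
    """Build a CREATE CATALOG ... USING SHARE statement.

    provider_view_catalogs lists catalog names, as known from the data
    provider, that contain shared views in the provider's metastore.
    The new catalog must not reuse any of them.
    """
    reserved = {name.lower() for name in provider_view_catalogs}
    if catalog.lower() in reserved:
        raise ValueError(
            f"catalog name {catalog!r} matches a provider catalog that "
            "contains shared views; choose a different name"
        )
    return (
        f"CREATE CATALOG IF NOT EXISTS {quote_identifier(catalog)} "
        f"USING SHARE {quote_identifier(provider)}.{quote_identifier(share)}"
    )


if __name__ == "__main__":
    # Hypothetical provider and share names, for illustration only.
    print(create_catalog_statement("vendor_sales", "acme", "sales_share"))
```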