Database objects in SAP Databricks

SAP Databricks uses two primary securable objects to store and access data.

  • Tables govern access to tabular data.
  • Volumes govern access to non-tabular data.

This article describes how these database objects relate to catalogs, schemas, views, and other database objects in SAP Databricks. This article also provides a high-level introduction to how database objects work in the context of the overall platform architecture.

What are database objects in SAP Databricks?

Database objects are entities that help you organize, access, and govern data. SAP Databricks uses a three-tier hierarchy to organize database objects:

  1. Catalog: The top-level container. Contains schemas.
  2. Schema (also called a database): Contains data objects.
  3. Data objects that can be contained in a schema:
    • Volume: a logical volume of non-tabular data in cloud object storage.
    • Table: a collection of data organized by rows and columns.
    • View: a saved query against one or more tables.
    • Function: saved logic that returns a scalar value or set of rows.
    • Model: a machine learning model packaged with MLflow.

Catalogs are registered in a metastore that is managed at the account level.
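As an illustration of how the three tiers fit together, the following minimal Python sketch builds a dotted catalog.schema.object name. It assumes that objects are referenced by joining the three levels with dots and that identifiers are quoted with backticks, as in Databricks SQL. The helpers quote_identifier and qualified_name and the names main, sales, and orders are hypothetical and are not part of any SAP Databricks API.

```python
def quote_identifier(name: str) -> str:
    """Wrap an identifier in backticks, doubling any embedded backtick."""
    if not name:
        raise ValueError("identifier must not be empty")
    return "`" + name.replace("`", "``") + "`"


def qualified_name(catalog: str, schema: str, obj: str) -> str:
    """Build a three-level name of the form catalog.schema.object."""
    return ".".join(quote_identifier(part) for part in (catalog, schema, obj))


# Hypothetical catalog, schema, and table names, for illustration only.
print(qualified_name("main", "sales", "orders"))
```

Quoting every part keeps the helper safe for names that contain hyphens or other special characters, which can occur in automatically generated catalog names.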

SAP Databricks provides additional assets for working with data, all of which are governable using workspace-level access controls or Unity Catalog, the Databricks data governance solution:

  • Workspace-level data assets, like notebooks, jobs, and queries.
  • Unity Catalog securable objects like storage credentials and Delta Sharing shares, which primarily control access to storage or secure sharing.

Managing access to database objects using Unity Catalog

You can grant and revoke access to database objects at any level in the hierarchy, including the metastore itself. Access to an object implicitly grants the same access to all children of that object, unless access is revoked.
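The inheritance rule can be pictured with a small toy model. The Python sketch below is not how Unity Catalog is implemented; it only assumes that a principal's effective privileges on an object are the union of the grants made on that object and on each of its ancestors, and that a revoke removes the grant at the level where it was made. The Securable class and the names main, sales, orders, and analysts are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Securable:
    """A node in the hierarchy: metastore, catalog, schema, or data object."""
    name: str
    parent: Optional["Securable"] = None
    # Maps a principal to the privileges granted directly on this object.
    grants: dict[str, set[str]] = field(default_factory=dict)

    def grant(self, principal: str, privilege: str) -> None:
        self.grants.setdefault(principal, set()).add(privilege)

    def revoke(self, principal: str, privilege: str) -> None:
        self.grants.get(principal, set()).discard(privilege)

    def effective_privileges(self, principal: str) -> set[str]:
        """Union of the grants on this object and on every ancestor."""
        result: set[str] = set()
        node: Optional["Securable"] = self
        while node is not None:
            result |= node.grants.get(principal, set())
            node = node.parent
        return result


# Hypothetical hierarchy: catalog -> schema -> table.
catalog = Securable("main")
schema = Securable("sales", parent=catalog)
table = Securable("orders", parent=schema)

catalog.grant("analysts", "SELECT")
print(table.effective_privileges("analysts"))  # inherited from the catalog

catalog.revoke("analysts", "SELECT")
print(table.effective_privileges("analysts"))  # nothing left to inherit
```

In this model, a grant on the catalog reaches every schema and table beneath it, including ones added later, which is why granting at a higher level is a compact way to manage access for a whole subtree.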

You can use typical ANSI SQL commands to grant and revoke access to objects in Unity Catalog. You can also use Catalog Explorer for UI-driven management of data object privileges.
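As a rough sketch of what those commands look like, the Python helpers below assemble GRANT and REVOKE statements as strings. They assume the Databricks SQL forms GRANT ... ON ... TO ... and REVOKE ... ON ... FROM ...; the helper names, the table main.sales.orders, and the group analysts are hypothetical. Check the privilege names and securable types available in your environment against the SQL reference before running the generated statements.

```python
def grant_statement(privilege: str, securable_type: str, name: str, principal: str) -> str:
    """Build a GRANT statement; the principal is quoted with backticks."""
    return f"GRANT {privilege} ON {securable_type} {name} TO `{principal}`"


def revoke_statement(privilege: str, securable_type: str, name: str, principal: str) -> str:
    """Build the matching REVOKE statement."""
    return f"REVOKE {privilege} ON {securable_type} {name} FROM `{principal}`"


# Hypothetical table and group names, for illustration only.
print(grant_statement("SELECT", "TABLE", "main.sales.orders", "analysts"))
print(revoke_statement("SELECT", "TABLE", "main.sales.orders", "analysts"))
```

The resulting strings can be pasted into a SQL editor or passed to whatever SQL execution interface your workspace provides.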

Default object permissions in Unity Catalog

Users have default permissions on automatically provisioned catalogs, including the workspace catalog (<workspace-name>). This catalog contains a schema named default that is accessible to all users in the workspace.

Database objects vs. workspace securable data assets

SAP Databricks allows you to manage multiple data engineering, analytics, ML, and AI assets alongside your database objects. You do not register these data assets in Unity Catalog. Instead, these assets are managed at the workspace level, using access control lists (ACLs) to govern permissions. These data assets include the following:

  • Notebooks
  • Workspace files
  • SQL queries
  • Experiments

Most data assets contain logic that interacts with database objects to query data, use functions, register models, or perform other common tasks.

Managed storage locations for managed volumes and tables

When you create tables and volumes in SAP Databricks, you can make them managed or external. Unity Catalog manages access to external tables and volumes from SAP Databricks but doesn't control the underlying files or fully manage the storage location of those files. Managed tables and volumes, on the other hand, are fully managed by Unity Catalog and are stored in a managed storage location that is associated with the containing schema.

Databricks recommends managed volumes and managed tables for most workloads, because they simplify configuration, optimization, and governance.
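To make the distinction concrete, the Python sketch below assembles CREATE TABLE statements for both cases. It assumes that, as in Databricks SQL, a table created without a LOCATION clause is managed and a table created with one is external. The helper create_table_statement, the table names, the column definition, and the storage path are all hypothetical.

```python
from typing import Optional


def create_table_statement(name: str, columns: str, location: Optional[str] = None) -> str:
    """Build a CREATE TABLE statement.

    With no location the table is managed; with a location it is external
    (assumed behavior, based on the Databricks SQL LOCATION clause).
    """
    statement = f"CREATE TABLE {name} ({columns})"
    if location is not None:
        statement += f" LOCATION '{location}'"
    return statement


# Managed table: storage is chosen by Unity Catalog.
print(create_table_statement("main.sales.orders", "id INT, amount DOUBLE"))

# External table: the path below is a placeholder, not a real location.
print(create_table_statement(
    "main.sales.orders_ext",
    "id INT, amount DOUBLE",
    location="s3://example-bucket/orders",
))
```

The managed form is shorter because it leaves the storage decision to Unity Catalog, which is in line with the recommendation above.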