Default storage in Databricks
This page explains how default storage on Databricks works and how to create catalogs and data objects that use it.
What is default storage?
Default storage is a fully managed object storage platform that provides ready-to-use storage in your Databricks account. Some Databricks features use default storage as an alternative to external storage.
Serverless workspaces use default storage for internal and workspace storage, and for the default catalog that is created with the workspace. In serverless workspaces, you can create additional catalogs either in default storage or in your own cloud object storage.
In both classic workspaces and serverless workspaces, Databricks features use default storage to store artifacts such as control plane metadata, derived data, and models. For example, Clean Rooms, Data Classification, and Anomaly detection all use a workspace's default storage. Refer to the individual feature documentation for details on what each feature stores on default storage.
Requirements
- Creating catalogs on default storage is only available in serverless workspaces (Public Preview).
- By default, catalogs that use default storage are only accessible from the workspace where they are created. You can grant other workspaces access, including classic workspaces, but they must use serverless compute to access data in the catalog. See Limit catalog access to specific workspaces.
- You must have CREATE CATALOG privileges to create a catalog with default storage. See Unity Catalog privileges and securable objects. An example grant statement appears after this list.
- If your client is using the Databricks ODBC driver to access a default storage catalog from behind a firewall, you must configure your firewall to allow access to Databricks regional storage gateways. For IP and domain name details for default storage, see IP addresses and domains for Databricks services and assets.
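For example, a metastore admin might grant the CREATE CATALOG privilege to a group with a statement like the following; the group name data-engineers is a placeholder:
GRANT CREATE CATALOG ON METASTORE TO `data-engineers`;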
Create a new catalog
Complete the following steps to create a new catalog using default storage:
- Click Catalog in the sidebar. Catalog Explorer appears.
- Click Create catalog. The Create a new catalog dialog appears.
- Provide a Catalog name that is unique in your account.
- Select the option to Use default storage.
- Click Create.
In serverless workspaces, you can also use the following SQL command to create a new catalog in your default storage. You do not need to specify a location for the catalog.
CREATE CATALOG [ IF NOT EXISTS ] catalog_name
[ COMMENT comment ]
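For example, the following statement creates a catalog named sales_analytics (a hypothetical name) on default storage:
CREATE CATALOG IF NOT EXISTS sales_analytics
  COMMENT 'Catalog backed by default storage';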
Work with default storage
All interactions with default storage require serverless, Unity Catalog-enabled compute.
Resources backed by default storage use the same privilege model as other objects in Unity Catalog. You must have sufficient privileges to create, view, query, or modify data objects. See Unity Catalog privileges and securable objects.
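As a sketch, assuming the hypothetical sales_analytics catalog from the earlier example and a placeholder group named data-analysts, the following statements grant the privileges typically needed to query tables in one of its schemas:
GRANT USE CATALOG ON CATALOG sales_analytics TO `data-analysts`;
GRANT USE SCHEMA ON SCHEMA sales_analytics.reporting TO `data-analysts`;
GRANT SELECT ON SCHEMA sales_analytics.reporting TO `data-analysts`;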
You work with default storage by creating and interacting with managed tables and managed volumes backed by default storage. See Unity Catalog managed tables in Databricks for Delta Lake and Apache Iceberg and What are Unity Catalog volumes?.
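For example, assuming the hypothetical sales_analytics catalog on default storage, the following statements create a schema, a managed table, and a managed volume. Because no storage location is specified, the objects are stored in default storage:
CREATE SCHEMA IF NOT EXISTS sales_analytics.reporting;

CREATE TABLE IF NOT EXISTS sales_analytics.reporting.orders (
  order_id BIGINT,
  order_ts TIMESTAMP,
  amount DECIMAL(10, 2)
);

CREATE VOLUME IF NOT EXISTS sales_analytics.reporting.raw_files;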
You can use Catalog Explorer, notebooks, the SQL editor, and dashboards to interact with data objects stored in default storage.
Example tasks
The following are examples of tasks you can complete with default storage:
- Upload local files to a managed volume or use them to create a managed table. See Upload files to a Unity Catalog volume and Create or modify a table using file upload.
- Query data with notebooks. See Tutorial: Query and visualize data from a notebook.
- Create a dashboard. See Create a dashboard.
- Query data with SQL and schedule SQL queries. See Write queries and explore data in the new SQL editor.
- Ingest data from an external volume to a managed table, as sketched after this list. See Using Auto Loader with Unity Catalog.
- Ingest data to a managed table with Fivetran. See Connect to Fivetran.
- Use BI tools to explore managed tables. See Connect Tableau and Databricks and Power BI with Databricks.
- Run serverless notebooks. See Serverless compute for notebooks.
- Run serverless jobs. See Run your Lakeflow Jobs with serverless compute for workflows.
- Run model serving endpoints. See Deploy models using Mosaic AI Model Serving.
- Run serverless Lakeflow Spark Declarative Pipelines. See Configure a serverless pipeline.
- Use predictive optimization on your tables. See Predictive optimization for Unity Catalog managed tables.
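As a sketch of the Auto Loader ingestion task above, the following streaming table definition ingests JSON files from a volume path (assumed here to be an external volume) into a managed table on default storage. The catalog, schema, volume path, and file format are placeholders:
CREATE OR REFRESH STREAMING TABLE sales_analytics.reporting.raw_orders
AS SELECT *
FROM STREAM read_files(
  '/Volumes/sales_analytics/landing/incoming_orders/',
  format => 'json'
);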
Limitations
The following limitations apply:
- Classic compute (any compute that is not serverless) cannot interact with data assets in default storage.
- Delta Sharing supports sharing tables to any recipient, open or Databricks, and recipients can use classic compute to access shared tables (Beta). To use this capability, enable the Delta Sharing for Default Storage – Expanded Access feature in your account console.
- All other shareable assets can only be Delta shared with Databricks recipients on the same cloud. Recipients must use serverless compute.
- Tables with partitioning enabled cannot be Delta shared.
- External Iceberg and Delta clients cannot directly access the underlying metadata, manifest list, and data files for UC tables on default storage (FileIO access is not supported). However, BI tools such as Power BI and Tableau can access Unity Catalog tables on default storage using ODBC and JDBC drivers. External clients can also access Unity Catalog volumes on default storage using the Files API.
- Default storage supports external access via the Databricks ODBC and JDBC drivers, including the ODBC driver's Cloud Fetch performance optimization for queries over larger datasets. However, if you access a default storage table from a workspace that has front-end Private Service Connect enabled, ODBC client queries that return more than 100 MB of results fail, because the Cloud Fetch optimization for default storage tables does not currently support front-end Private Service Connect.