Upload files to a Unity Catalog volume

You can upload files in any format to a volume, including structured, semi-structured, and unstructured data. Files uploaded through the Databricks UI can't exceed 5 GB per file. To upload files larger than 5 GB, use the Databricks SDK for Python. This page provides an overview of all supported methods to upload files to a volume: the Databricks UI, the Databricks SDK, and the Databricks CLI.
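The size rule above can be sketched as a small pre-flight check. This is a minimal illustration, not part of any Databricks library: the function names are hypothetical, and the 5 GB limit is assumed here to mean 5 * 1024**3 bytes.

```python
import os

# Assumption: the 5 GB UI limit is interpreted as 5 * 1024**3 bytes.
UI_UPLOAD_LIMIT_BYTES = 5 * 1024**3


def suggest_upload_method(size_bytes: int) -> str:
    """Return "ui" if a file of this size fits under the UI limit, else "sdk"."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return "ui" if size_bytes <= UI_UPLOAD_LIMIT_BYTES else "sdk"


def suggest_upload_method_for_file(local_path: str) -> str:
    """Same check, reading the size of a local file from disk."""
    return suggest_upload_method(os.path.getsize(local_path))
```

A result of "ui" only means the UI is an option for that file; the SDK remains available for files of any size covered by this page.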

For more details about uploading and managing files in volumes, see Work with files in Unity Catalog volumes.

Prerequisites

Before you upload to a volume, make sure you have the following:

  • A workspace with Unity Catalog enabled
  • WRITE VOLUME on the target volume
  • USE SCHEMA on the parent schema
  • USE CATALOG on the parent catalog
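
If a principal is missing these privileges, an administrator could grant them with SQL. The sketch below only builds the three GRANT statements as strings; the helper name and all argument values are hypothetical, and it assumes the names contain no backtick characters.

```python
def build_grant_statements(catalog: str, schema: str, volume: str, principal: str) -> list[str]:
    """Build GRANT statements for the three privileges listed above.

    Assumption: identifiers contain no backticks, so simple backtick
    quoting is sufficient.
    """
    return [
        f"GRANT USE CATALOG ON CATALOG `{catalog}` TO `{principal}`",
        f"GRANT USE SCHEMA ON SCHEMA `{catalog}`.`{schema}` TO `{principal}`",
        f"GRANT WRITE VOLUME ON VOLUME `{catalog}`.`{schema}`.`{volume}` TO `{principal}`",
    ]


# Example with hypothetical names; print the statements for review before running them.
for statement in build_grant_statements("main", "default", "my-volume", "data-engineers"):
    print(statement)
```

The statements are returned rather than executed so they can be reviewed first, then run in a SQL editor or notebook by someone with permission to grant on those objects.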

Upload using the Databricks UI

Follow these steps to upload files to a volume using the Databricks UI:

  1. In the sidebar, click New, then Add or upload data.
  2. Click Upload files to a volume.
  3. Under Files, click browse or drag and drop files into the drop zone.
  4. Under Destination volume, select a volume or directory, or paste a volume path.

If no volume exists in the target schema, you can create one by clicking Create volume. You can also create a new directory within the target volume.
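
When pasting a volume path in step 4, the path follows the pattern /Volumes/<catalog>/<schema>/<volume>/<directory>. The helper below is a minimal sketch for assembling such a path; the function name and the example names are hypothetical.

```python
def volume_path(catalog: str, schema: str, volume: str, *parts: str) -> str:
    """Join catalog, schema, volume, and optional sub-paths into a volume path."""
    for name in (catalog, schema, volume):
        if not name or "/" in name:
            raise ValueError(f"invalid name: {name!r}")
    # Drop empty segments and stray slashes so the result has single separators.
    clean = [p.strip("/") for p in parts if p.strip("/")]
    return "/".join(["/Volumes", catalog, schema, volume, *clean])


# Example with hypothetical catalog and schema names.
print(volume_path("main", "default", "my-volume", "squirrel-data", "squirrels.csv"))
```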

Uploading a file to a volume using the UI

Upload using the Databricks SDK

The following code snippets show how to upload files using the Databricks SDK for Python:

Python
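# --- Uploading a file to a volume ---
import io

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # uses your configured Databricks authentication
volume_file_path = "/Volumes/<catalog>/<schema>/<volume>/<file-name>"  # destination (placeholder)
upload_file_path = "<local-file-path>"  # source file (placeholder)

# Upload method 1 (recommended when your data is in a local file path)
w.files.upload_from(volume_file_path, upload_file_path, overwrite=True)

# Upload method 2 (recommended when your data is in-memory or not a local file)
with open(upload_file_path, "rb") as f:
    w.files.upload(volume_file_path, io.BytesIO(f.read()), overwrite=True)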
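
Method 2 is most useful when the data never touches the local disk. The following is a minimal sketch of that case, assuming only that the object passed in exposes an upload(path, stream, overwrite=...) method, as w.files does in the snippet above. The helper name upload_bytes is hypothetical.

```python
import io


def upload_bytes(files_api, volume_file_path: str, data: bytes, overwrite: bool = True) -> int:
    """Upload in-memory bytes to a volume path and return the number of bytes sent.

    files_api is any object with an upload(path, stream, overwrite=...) method.
    """
    stream = io.BytesIO(data)  # wrap the bytes in a file-like object
    files_api.upload(volume_file_path, stream, overwrite=overwrite)
    return len(data)
```

With a real client, the call would look like upload_bytes(w.files, volume_file_path, b"a,b\n1,2\n"). Taking the API object as a parameter also makes the helper easy to exercise with a stand-in object in unit tests.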

Upload using the Databricks CLI

The following example uploads a file named squirrels.csv from a local filesystem path to a directory named squirrel-data in a volume named my-volume. If the file already exists in the destination, it is overwritten.

databricks fs cp /Users/<username>/squirrels.csv dbfs:/Volumes/<catalog>/<schema>/my-volume/squirrel-data --overwrite
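
To script the same copy from Python, one option is to assemble the CLI arguments and run them with subprocess. This is a sketch under assumptions: the helper names are hypothetical, the Databricks CLI is assumed to be installed and authenticated, and the destination is passed through unchanged, so it must already be in the form the CLI expects.

```python
import subprocess


def build_fs_cp_command(local_path: str, destination: str, overwrite: bool = True) -> list[str]:
    """Build the argument list for a `databricks fs cp` invocation."""
    cmd = ["databricks", "fs", "cp", local_path, destination]
    if overwrite:
        cmd.append("--overwrite")
    return cmd


def run_fs_cp(local_path: str, destination: str, overwrite: bool = True) -> None:
    """Run the copy; raises CalledProcessError if the CLI exits with an error."""
    subprocess.run(build_fs_cp_command(local_path, destination, overwrite), check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.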