Manage files in Unity Catalog volumes with the Databricks ODBC Driver (Simba)
This page describes how to upload, download, and delete files in Unity Catalog volumes using the Databricks ODBC Driver.
Requirements
- Databricks ODBC Driver version 2.8.2 or above
- Native query mode enabled (default). If disabled, add
UseNativeQuery=1orUseNativeQuery=2to your connection string.
For a complete Python example with authentication setup, see Connect Python and pyodbc to Databricks.
Upload a file
To upload a file, add the StagingAllowedLocalPaths property to your connection string with the path of the file to upload. For multiple source locations, use a comma-separated list (for example, /tmp/,/usr/tmp/).
In multi-tenant environments where users control the ODBC URL (such as BI tools or developer services), set StagingAllowedLocalPaths to a sandbox location or non-existent path. This prevents users from writing arbitrary files and interfering with the service's internal deployment.
To overwrite an existing file, add OVERWRITE to the statement.
conn_string = "".join([
"DRIVER=", os.getenv("ODBC_DRIVER", "/Library/simba/spark/lib/libsparkodbc_sbu.dylib"),
";Host=", os.getenv("ODBC_HOST_NAME", "<<HOST_NAME>>"),
";PORT=443",
";HTTPPath=", os.getenv("ODBC_HTTP_PATH", "/sql/1.0/endpoints/1234567890"),
";AuthMech=11",
";SSL=1",
";ThriftTransport=2",
";SparkServerType=3",
";Auth_Flow=0",
";Auth_AccessToken=", os.getenv("API_TOKEN", "<<NO_ACCESS_TOKEN_IS_SET>>"),
";StagingAllowedLocalPaths=", "/tmp"),
os.getenv("ODBC_OPTIONS", ""),
])
conn = pyodbc.connect(conn_string, autocommit=True)
cursor = conn.cursor()
cursor.execute("PUT '" +
"/tmp/my-data.csv" +
"' INTO '" +
"/Volumes/main/default/my-volume/my-data.csv" +
"' OVERWRITE")
Download a file
Use GET to download a file from a volume to a local path:
conn = pyodbc.connect(conn_string, autocommit=True)
cursor = conn.cursor()
cursor.execute("GET '" +
"/Volumes/main/default/my-volume/my-data.csv" +
"' TO '" +
"/tmp/my-downloaded-data.csv" +
"'")
Delete a file
Use REMOVE to delete a file from a volume:
conn = pyodbc.connect(conn_string, autocommit=True)
cursor = conn.cursor()
cursor.execute("REMOVE '" +
"/Volumes/main/default/my-volume/my-data.csv" +
"'")