Work with external tables

External tables store data in a directory in cloud object storage in your cloud tenant. You must specify a storage location when you define an external table.

Databricks recommends using external tables only when you require direct access to the data without using compute on Databricks. Unity Catalog privileges are not enforced when users access data files from external systems.

Note

This article focuses on Unity Catalog external tables. External tables in the legacy Hive metastore have different behaviors. See Database objects in the legacy Hive metastore.

Databricks only manages the metadata for external tables and does not use the managed storage location associated with the containing schema. The table registration in Unity Catalog is just a pointer to data files. When you drop an external table, the data files are not deleted.

When you create an external table, you can either register an existing directory of data files as a table or provide a path to create new data files.

External tables can use the following file formats (see the example after this list):

  • DELTA

  • CSV

  • JSON

  • AVRO

  • PARQUET

  • ORC

  • TEXT
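
For example, the following is a minimal sketch that registers a directory of CSV files as an external table. The USING clause selects the file format (Delta Lake is the default when it is omitted); the header option and all placeholder values are illustrative assumptions.

-- Register a directory of CSV files as an external table.
-- Omit the USING clause to create a Delta table instead.
CREATE TABLE <catalog>.<schema>.<table-name>
USING CSV
OPTIONS (header = "true")
LOCATION 's3://<bucket-path>/<table-directory>';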

Create an external table

To create an external table, you can use SQL commands or DataFrame write operations.

Before you begin

To create an external table, you must meet the following permission requirements (see the example after this list):

  • The CREATE EXTERNAL TABLE privilege on an external location that grants access to the LOCATION accessed by the external table.

  • The USE SCHEMA permission on the table’s parent schema.

  • The USE CATALOG permission on the table’s parent catalog.

  • The CREATE TABLE permission on the table’s parent schema.
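
As an illustration, the following sketch grants those privileges, assuming a hypothetical external location named my_external_location and a hypothetical group named data_engineers:

-- Allow the group to create external tables under the external location.
GRANT CREATE EXTERNAL TABLE ON EXTERNAL LOCATION my_external_location TO `data_engineers`;
-- Allow the group to use the parent catalog and schema, and to create tables.
GRANT USE CATALOG ON CATALOG <catalog> TO `data_engineers`;
GRANT USE SCHEMA ON SCHEMA <catalog>.<schema> TO `data_engineers`;
GRANT CREATE TABLE ON SCHEMA <catalog>.<schema> TO `data_engineers`;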

For more information about configuring external locations, see Create an external location to connect cloud storage to Databricks.

Note

When you grant access to an external table, be aware of the following: Databricks recommends that you grant write privileges on a table backed by an external location in S3 only if that external location is defined in a single metastore. You can safely read data in a single external S3 location from more than one metastore, but concurrent writes to the same S3 location from multiple metastores can lead to consistency issues.

SQL command examples

Use one of the following command examples in a notebook or the SQL query editor to create an external table.

In the following examples, replace the placeholder values:

  • <catalog>: The name of the catalog that will contain the table.

  • <schema>: The name of the schema that will contain the table.

  • <table-name>: A name for the table.

  • <column-specification>: The name and data type for each column.

  • <bucket-path>: The path to the cloud storage bucket where the table will be created.

  • <table-directory>: A directory where the table will be created. Use a unique directory for each table.

CREATE TABLE <catalog>.<schema>.<table-name>
(
  <column-specification>
)
LOCATION 's3://<bucket-path>/<table-directory>';

For more information about table creation parameters, see CREATE TABLE.
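
If the target directory already contains Delta Lake data, you can register it as an external table without a column specification; the schema is read from the Delta transaction log. A minimal sketch with placeholder values:

-- Register an existing Delta directory; no column specification is needed.
CREATE TABLE <catalog>.<schema>.<table-name>
LOCATION 's3://<bucket-path>/<existing-table-directory>';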

DataFrame write operations

Many users create external tables from query results or DataFrame write operations, as shown in the sketch below.
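
For example, the following minimal PySpark sketch (for a Databricks notebook, where spark is predefined) writes a small DataFrame to a cloud storage path and registers it as an external table. The catalog, schema, table, and path names are placeholder assumptions.

# Create a small example DataFrame.
df = spark.createDataFrame(
    [(1, "alpha"), (2, "beta")],
    ["id", "label"],
)

# Supplying an explicit path makes the registered table external:
# Unity Catalog tracks the metadata, but the files live at this location.
(df.write
    .format("delta")
    .mode("overwrite")
    .option("path", "s3://<bucket-path>/<table-directory>")
    .saveAsTable("<catalog>.<schema>.<table-name>"))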

Drop an external table

To drop a table, you must be its owner or have the MANAGE privilege on the table. To drop an external table, run the following SQL command:

DROP TABLE IF EXISTS catalog_name.schema_name.table_name;

Unity Catalog does not delete the underlying data in cloud storage when you drop an external table. You must directly delete the underlying data files if you need to remove data associated with the table.
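
For example, a minimal sketch using Databricks Utilities in a notebook to remove the table's directory after the table is dropped; the path is a placeholder, and the deletion is permanent:

# Recursively delete the table's data directory. This cannot be undone.
dbutils.fs.rm("s3://<bucket-path>/<table-directory>", True)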

Example notebook: Create external tables

You can use the following example notebook to create a catalog, schema, and external table, and to manage permissions on them.

Create and manage an external table in Unity Catalog notebook
