External tables (Databricks SQL)

Unity Catalog and the built-in Databricks Hive metastore use default locations for managed tables. Unity Catalog introduces several new securable objects to grant privileges to data in cloud object storage.

External table

An external table is a table that references an external storage path by using a LOCATION clause.

The storage path should be contained in an existing external location to which you have been granted access.

Alternatively you can reference a storage credential to which you have been granted access.

Using external tables abstracts away the storage path, external location, and storage credential for users whom are granted access to the external table.

Warning

To avoid accidental data loss, do not register a schema (database) to a location with existing data or create new external tables in a location managed by a schema. Dropping a schema will recursively delete all data files in the managed location.

Graphical Representation of relationships

The following diagram describes the relationship between:

  • storage credentials

  • external locations

  • external tables

  • storage paths

  • IAM entities

  • Azure service accounts

External location ER diagram

Examples

-- `finance` can create an external table over specific object within the `finance_loc` location
> CREATE TABLE sec_filings LOCATION 's3://depts/finance/sec_filings`;

-- Cannot list files under an external table without permissions on it
> LIST `s3://depts/finance/sec_filings`
  Error
> LIST `s3://depts/finance/sec_filings/_delta_log`
  Error

-- Grant access to sec_filings to all employees
> GRANT SELECT ON TABLE sec_filings TO employee;

-- Any member of the `employee` group can securely read sec_filings
> SELECT count(1) FROM sec_filings;
 20

-- Any member of the `employee` group can list files under the sec_filings table
> LIST `s3://depts/finance/sec_filings`
  _delta_log
> LIST `s3://depts/finance/sec_filings/_delta_log`
  00000.json