Compare Auto Loader file detection modes

Auto Loader supports two modes for detecting new files: directory listing and file notification. You can switch file discovery modes across stream restarts and still obtain exactly-once data processing guarantees.

Directory listing mode

In directory listing mode, Auto Loader identifies new files by listing the input directory. Directory listing mode allows you to quickly start Auto Loader streams without any permission configurations other than access to your data on cloud storage.

In Databricks Runtime 9.1 and above, Auto Loader can automatically detect whether files arrive in your cloud storage in lexical order, and significantly reduce the number of API calls needed to detect new files. See Auto Loader streams with directory listing mode for more details.
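As a minimal sketch of what starting a directory listing stream might look like in PySpark, the helper below only assembles the reader options, so it runs without a cluster. It assumes that directory listing is the mode Auto Loader uses when no notification option is set. `cloudFiles.format` and `cloudFiles.schemaLocation` are Auto Loader options; the helper name and the paths are hypothetical.

```python
def directory_listing_options(file_format: str, schema_location: str) -> dict[str, str]:
    """Build Auto Loader reader options for directory listing mode.

    Assumption: directory listing is used when no notification option
    is set, so only the file format and schema location are supplied.
    """
    return {
        "cloudFiles.format": file_format,
        "cloudFiles.schemaLocation": schema_location,
    }


# Hypothetical paths, for illustration only.
options = directory_listing_options("json", "/tmp/example/_schema")

# On a Databricks cluster, where `spark` is predefined, the options
# would be applied roughly like this:
# df = (spark.readStream.format("cloudFiles")
#       .options(**options)
#       .load("/tmp/example/input"))
```

Beyond read access to the input path, nothing else is configured here, which matches the low setup cost described above.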

File notification mode

File notification mode leverages file notification and queue services in your cloud infrastructure account. Auto Loader can automatically set up a notification service and queue service that subscribe to file events from the input directory. If you enable file events on the external location that contains the files, you do not need to provide additional permissions when you set up the Auto Loader stream.

File notification mode with file events is more performant and scalable than directory listing. Databricks recommends file notification mode using file events instead of directory listing mode for most workloads. If you are using Auto Loader in directory listing mode today, Databricks recommends that you migrate to file notification mode using file events to see significant performance improvements. See Configure Auto Loader streams in file notification mode.
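The choice of detection mode comes down to a single reader option. The sketch below maps each mode to the option that selects it, under the assumption that `cloudFiles.useManagedFileEvents` enables file notifications with file events and `cloudFiles.useNotifications` enables file notifications without file events; confirm both names against the Auto Loader options reference for your runtime. The function name and the mode labels are hypothetical.

```python
def file_detection_options(mode: str) -> dict[str, str]:
    """Return the Auto Loader option that selects a file detection mode.

    The mode labels are hypothetical names used only in this sketch.
    """
    if mode == "directory_listing":
        # Assumption: no option is needed for directory listing.
        return {}
    if mode == "file_events":
        # File notification mode with file events.
        return {"cloudFiles.useManagedFileEvents": "true"}
    if mode == "notifications_without_file_events":
        # File notification mode without file events.
        return {"cloudFiles.useNotifications": "true"}
    raise ValueError(f"unknown file detection mode: {mode!r}")


# Migrating a stream: stop it, change the mode option, and restart it.
before = file_detection_options("directory_listing")
after = file_detection_options("file_events")
```

When migrating, the usual approach is to restart the stream with the new option while keeping the same `checkpointLocation`, since the checkpoint is what records which files have already been processed.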

Cloud storage supported by modes

The following table lists the supported compute (Databricks Runtime versions) for each file detection mode, by cloud storage provider.

If you migrate from an external location or a DBFS mount to a Unity Catalog volume, Auto Loader continues to provide exactly-once guarantees.

| Cloud storage | Directory listing | File notifications without file events | File notifications with file events |
| --- | --- | --- | --- |
| AWS S3 | All versions | All versions | Databricks Runtime 14.3 LTS and above |
| ADLS | All versions | All versions | Databricks Runtime 14.3 LTS and above |
| GCS | All versions | All versions | Databricks Runtime 14.3 LTS and above |
| Azure Blob Storage | All versions | All versions | Unsupported |
| DBFS | All versions | For mount points only | Databricks Runtime 14.3 LTS and above, if the DBFS mount point has an external location defined in Unity Catalog |
| Unity Catalog volume | Databricks Runtime 13.3 LTS and above | Unsupported | Databricks Runtime 14.3 LTS and above |
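
If you want to check a storage and mode combination in a script, for example before migrating a stream, the support matrix above can be encoded as a small lookup. This is only an illustrative sketch: the values are copied from the table, while the dictionary, the mode labels, and the function name are hypothetical.

```python
# Column order: directory listing, file notifications without file events,
# file notifications with file events.
MODES = (
    "directory_listing",
    "notifications_without_file_events",
    "notifications_with_file_events",
)

DBR_14_3 = "Databricks Runtime 14.3 LTS and above"

SUPPORT = {
    "AWS S3": ("All versions", "All versions", DBR_14_3),
    "ADLS": ("All versions", "All versions", DBR_14_3),
    "GCS": ("All versions", "All versions", DBR_14_3),
    "Azure Blob Storage": ("All versions", "All versions", "Unsupported"),
    "DBFS": (
        "All versions",
        "For mount points only",
        DBR_14_3 + ", if the DBFS mount point has an external location "
        "defined in Unity Catalog",
    ),
    "Unity Catalog volume": (
        "Databricks Runtime 13.3 LTS and above",
        "Unsupported",
        DBR_14_3,
    ),
}


def support_for(storage: str, mode: str) -> str:
    """Look up the support statement for a storage provider and mode."""
    return SUPPORT[storage][MODES.index(mode)]


def is_supported(storage: str, mode: str) -> bool:
    """True unless the table marks the combination as unsupported."""
    return support_for(storage, mode) != "Unsupported"
```

For example, `is_supported("Azure Blob Storage", "notifications_with_file_events")` returns `False`, which matches the table.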