Trigger jobs when new files arrive
Preview
File arrival triggers are in Public Preview.
You can use file arrival triggers to start a run of your Databricks job when new files arrive in an external location such as Amazon S3 or Azure storage. This is useful when a scheduled job would be inefficient because new data arrives on an irregular schedule.
File arrival triggers check for new files every minute and incur no additional cost beyond the cloud provider costs associated with listing files in the storage location.
Requirements
The following are required to use file arrival triggers:
The workspace must have Unity Catalog enabled.
You must use an external location added to the Unity Catalog metastore. See Manage external locations and storage credentials.
You must have READ permissions to the external location and Can Manage permissions on the job. For more information about job permissions, see Jobs access control.
Limitations
File arrival triggers work only for external locations with up to 10,000 files. Locations with more files cannot be monitored for new file arrivals.
The path used for a file arrival trigger must not contain any external tables or managed locations of catalogs and schemas.
Add a file arrival trigger
To add a file arrival trigger to a job:
In the sidebar, click Workflows.
In the Name column on the Jobs tab, click the job name.
In the Job details panel on the right, click Add trigger.
In Trigger type, select File arrival.
In Storage location, enter the URL of the external location or a subdirectory of the external location to monitor.
(Optional) Configure advanced options:
Minimum time between triggers in seconds: The minimum time to wait to trigger a run after a previous run completes. Files that arrive in this period trigger a run only after the waiting time expires. Use this setting to control the frequency of run creation.
Wait after last change in seconds: The time to wait to trigger a run after file arrival. Another file arrival within this period resets the timer. This setting can be used when files arrive in batches, and the whole batch needs to be processed after all files have arrived.
To validate the configuration, click Test connection.
Click Save.
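If you manage jobs programmatically rather than through the UI, the same settings can be expressed as a job settings fragment. The following Python sketch builds such a fragment. It assumes the field names used by the Jobs API 2.1 trigger settings (file_arrival, url, min_time_between_triggers_seconds, wait_after_last_change_seconds, pause_status); verify them against the API reference for your workspace. The helper name build_file_arrival_trigger and the bucket path are hypothetical, chosen for illustration only.

```python
import json
from typing import Optional


def build_file_arrival_trigger(
    url: str,
    min_time_between_triggers_seconds: Optional[int] = None,
    wait_after_last_change_seconds: Optional[int] = None,
) -> dict:
    """Build a trigger settings fragment for a file arrival trigger.

    Field names are assumed to match the Jobs API 2.1 trigger settings.
    """
    if not url:
        raise ValueError("url must be a non-empty storage location")

    file_arrival = {"url": url}

    # The two advanced options are optional; include them only when set.
    for name, value in (
        ("min_time_between_triggers_seconds", min_time_between_triggers_seconds),
        ("wait_after_last_change_seconds", wait_after_last_change_seconds),
    ):
        if value is not None:
            if value < 0:
                raise ValueError(f"{name} must not be negative")
            file_arrival[name] = value

    return {"pause_status": "UNPAUSED", "file_arrival": file_arrival}


trigger = build_file_arrival_trigger(
    "s3://example-bucket/landing/",  # hypothetical external location path
    min_time_between_triggers_seconds=300,
    wait_after_last_change_seconds=120,
)

# Print the fragment as it would appear inside the job settings.
print(json.dumps({"trigger": trigger}, indent=2))
```

The fragment only describes the trigger. Sending it to the workspace (for example, as part of a job update request) is left out because it depends on your authentication setup.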
Receive notifications of failed file arrival triggers
To be notified if a file arrival trigger fails to evaluate, configure email or system destination notifications on job failure. See Add email and system notifications for job events.
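For jobs defined through the API, failure notifications can be added to the job settings in the same way. The following Python sketch adds on-failure email recipients to an existing settings dictionary. It assumes the email_notifications and on_failure field names from the Jobs API 2.1; the helper name with_failure_emails, the job name, and the email addresses are hypothetical.

```python
import json


def with_failure_emails(job_settings: dict, recipients: list) -> dict:
    """Return a copy of job_settings with on-failure email recipients added.

    Field names are assumed to match the Jobs API 2.1 job settings.
    """
    updated = dict(job_settings)
    notifications = dict(updated.get("email_notifications", {}))
    existing = list(notifications.get("on_failure", []))

    # Preserve the existing order and skip addresses already present.
    for address in recipients:
        if address not in existing:
            existing.append(address)

    notifications["on_failure"] = existing
    updated["email_notifications"] = notifications
    return updated


settings = with_failure_emails(
    {"name": "ingest-landing-files"},  # hypothetical job name
    ["data-team@example.com"],         # hypothetical recipient
)
print(json.dumps(settings, indent=2))
```

The helper copies the containers it modifies, so the original settings dictionary is left unchanged.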