Schedule refreshes in Databricks SQL

You can manually refresh a pipeline created with Databricks SQL (either a materialized view or a streaming table) when you know that its source tables have been updated. You can also schedule refreshes to run at set times, when source tables update, or through orchestration. This page describes how to create schedules for your pipelines.

You can also create notifications and set the performance mode for your scheduled refreshes.

Create a schedule

You can configure a Databricks SQL pipeline to refresh automatically based on a defined schedule, or to trigger when upstream data is changed. The following table shows the different options for scheduling refreshes.

| Method | Description | Example use case |
| --- | --- | --- |
| Manual | On-demand refresh using a SQL REFRESH statement, or through the workspace UI. | Development, testing, ad hoc updates. |
| TRIGGER ON UPDATE | Schedule the pipeline to automatically refresh when the upstream data changes. | Production workloads with data freshness SLAs or unpredictable refresh periods. |
| SCHEDULE | Schedule the pipeline to refresh at defined time intervals. | Predictable, time-based refresh requirements. |
| SQL task in a job | Refresh is orchestrated through Lakeflow Jobs. | Complex pipelines with cross-system dependencies. |

Even when you schedule refreshes, you can run a manual refresh at any time if you need updated data.

Manual refresh

To manually refresh a pipeline, you can call a refresh from Databricks SQL, or use the workspace UI.

To refresh a pipeline using Databricks SQL:

  1. In the SQL editor, run the following statement:

    SQL
    REFRESH MATERIALIZED VIEW <table-name>;

    For streaming tables, use REFRESH STREAMING TABLE.

For more information, see REFRESH (MATERIALIZED VIEW or STREAMING TABLE).
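
For example, assuming a streaming table named catalog.schema.sales exists, the streaming-table form of the refresh is:

SQL
REFRESH STREAMING TABLE catalog.schema.sales;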

Trigger on update

The TRIGGER ON UPDATE clause automatically refreshes a pipeline when upstream source data changes. This eliminates the need to coordinate schedules across pipelines. The dataset stays fresh without requiring the user to know when upstream jobs finish or maintain complex scheduling logic.

This is the recommended approach for production workloads, especially when upstream dependencies don't run on predictable schedules. After configuring trigger on update, the pipeline monitors its source tables and refreshes automatically when changes in any of the upstream sources are detected.

Limitations

  • Upstream dependency limits: A pipeline can monitor a maximum of 10 upstream tables and 30 upstream views. For more dependencies, split the logic across multiple pipelines.
  • Workspace limits: A maximum of 1,000 pipelines with TRIGGER ON UPDATE can exist per workspace. Contact Databricks support if more than 1,000 are needed.
  • Minimum interval: The minimum trigger interval is 1 minute.

The following examples show how to set a trigger on update when defining a pipeline.

Create a pipeline with trigger on update

To create a pipeline that refreshes automatically when source data changes, include the TRIGGER ON UPDATE clause in the CREATE statement.

The following example creates a streaming table that reads customer orders and refreshes whenever the source orders table is updated:

SQL
CREATE OR REFRESH STREAMING TABLE catalog.schema.customer_orders
TRIGGER ON UPDATE
AS SELECT
  o.customer_id,
  o.name,
  o.order_id
FROM catalog.schema.orders o;

Throttle refresh frequency

If upstream data refreshes frequently, use AT MOST EVERY to cap how often the pipeline refreshes and limit compute costs. This is useful when source tables update frequently but downstream consumers don't need real-time data. The INTERVAL keyword is required before the time value.

The following example limits the streaming table to refresh at most every 5 minutes, even if source data changes more frequently:

SQL
CREATE OR REFRESH STREAMING TABLE catalog.schema.customer_orders
TRIGGER ON UPDATE AT MOST EVERY INTERVAL 5 MINUTES
AS SELECT
  o.customer_id,
  o.name,
  o.order_id
FROM catalog.schema.orders o;

Scheduled refresh

Refresh schedules can be defined directly in the pipeline definition to refresh the dataset at fixed time intervals. This approach is useful when the data update cadence is known and predictable refresh timing is desired.

When there is a refresh schedule, you can still run a manual refresh at any time if you want updated data.

Databricks supports two scheduling syntaxes: SCHEDULE EVERY for simple intervals and SCHEDULE CRON for precise scheduling. The SCHEDULE and SCHEDULE REFRESH keywords are semantically equivalent.

For details about the syntax and use of the SCHEDULE clause, see CREATE STREAMING TABLE SCHEDULE clause, or CREATE MATERIALIZED VIEW SCHEDULE clause.

When a schedule is created, a new Databricks job is automatically configured to process the update.

To view the schedule, do one of the following:

  • Run the DESCRIBE EXTENDED statement from the SQL editor in the Databricks UI. See DESCRIBE TABLE.
  • Use Catalog Explorer to view the dataset. The schedule is listed on the Overview tab, under Refresh status. See What is Catalog Explorer?.
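
For example, assuming a streaming table named catalog.schema.customer_orders, you can display its schedule (included in the detailed output) with:

SQL
DESCRIBE TABLE EXTENDED catalog.schema.customer_orders;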

The following examples show how to create a materialized view with a schedule:

Schedule every time interval

This example schedules a refresh every hour:

SQL
CREATE OR REPLACE MATERIALIZED VIEW catalog.schema.hourly_metrics
SCHEDULE EVERY 1 HOUR
AS SELECT
  date_trunc('hour', event_time) AS hour,
  count(*) AS events
FROM catalog.schema.raw_events
GROUP BY 1;

Schedule using cron

This example schedules a refresh every 15 minutes, on the quarter hour, in the UTC time zone:

SQL
CREATE OR REPLACE MATERIALIZED VIEW catalog.schema.regular_metrics
SCHEDULE CRON '0 */15 * * * ?' AT TIME ZONE 'UTC'
AS SELECT
  date_trunc('minute', event_time) AS minute,
  count(*) AS events
FROM catalog.schema.raw_events
WHERE event_time > current_timestamp() - INTERVAL 1 HOUR
GROUP BY 1;
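
Quartz cron expressions can also express less frequent schedules. The following sketch assumes the same raw_events source table and refreshes once per day at 02:00 UTC:

SQL
CREATE OR REPLACE MATERIALIZED VIEW catalog.schema.daily_metrics
SCHEDULE CRON '0 0 2 * * ?' AT TIME ZONE 'UTC'
AS SELECT
  date_trunc('day', event_time) AS day,
  count(*) AS events
FROM catalog.schema.raw_events
GROUP BY 1;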

SQL task in a job

Pipeline refreshes can be orchestrated through Lakeflow Jobs by creating SQL tasks that include REFRESH commands. This approach integrates pipeline refreshes into existing job orchestration workflows.

There are two ways to create a job for refreshing streaming tables:

  • From the SQL Editor: Write the REFRESH command and click the Schedule button to create a job directly from the query.
  • From the Jobs UI: Create a new job, add a SQL task type, and attach a SQL query or notebook with the REFRESH command.

The following example shows the SQL statement within a SQL task that refreshes a streaming table:

SQL
REFRESH STREAMING TABLE catalog.schema.sales;

This approach is appropriate when:

  • Complex multi-step pipelines have dependencies across systems.
  • Integration with existing job orchestration is required.
  • Job-level alerting and monitoring is needed.

SQL tasks use both the SQL warehouse attached to the job and the serverless compute that executes the refresh. If definition-based scheduling meets your requirements, switching to TRIGGER ON UPDATE or SCHEDULE can simplify the workflow.

Add a schedule to an existing pipeline

To set the schedule after creation, use the ALTER STREAMING TABLE or ALTER MATERIALIZED VIEW statement. For example:

SQL
-- Alters the schedule to refresh the streaming table when its upstream
-- data gets updated.
ALTER STREAMING TABLE sales
ADD TRIGGER ON UPDATE;
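
A time-based schedule can be added the same way. The following sketch assumes a materialized view named catalog.schema.hourly_metrics and uses the ADD SCHEDULE form:

SQL
-- Adds an hourly refresh schedule to an existing materialized view.
ALTER MATERIALIZED VIEW catalog.schema.hourly_metrics
ADD SCHEDULE EVERY 1 HOUR;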

Modify an existing schedule or trigger

If a pipeline already has a schedule or trigger associated, use ALTER SCHEDULE or ALTER TRIGGER ON UPDATE to change the refresh configuration. This applies whether changing from one schedule to another, one trigger to another, or switching between a schedule and a trigger.

The following example changes an existing schedule to refresh every 5 minutes:

SQL
ALTER STREAMING TABLE catalog.schema.my_table
ALTER SCHEDULE EVERY 5 MINUTES;
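
Similarly, for a pipeline that refreshes on a trigger, a sketch of changing the throttle interval (assuming the same hypothetical table) is:

SQL
ALTER STREAMING TABLE catalog.schema.my_table
ALTER TRIGGER ON UPDATE AT MOST EVERY INTERVAL 10 MINUTES;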

Drop a schedule or trigger

To remove a schedule, use ALTER ... DROP:

SQL
ALTER STREAMING TABLE catalog.schema.my_table
DROP SCHEDULE;
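
For a pipeline that refreshes on a trigger rather than a schedule, the analogous statement (a sketch, assuming the same hypothetical table) is:

SQL
ALTER STREAMING TABLE catalog.schema.my_table
DROP TRIGGER ON UPDATE;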

Track the status of a refresh

You can view the status of a refresh by viewing the pipeline in the Pipelines UI, or by viewing the refresh information returned by the DESCRIBE TABLE EXTENDED command for the dataset.

SQL
DESCRIBE TABLE EXTENDED <table-name>;

Alternately, you can view the dataset in Catalog Explorer and see the refresh status there:

  1. Click Catalog in the sidebar.
  2. In the Catalog Explorer tree at the left, open the catalog and select the schema where your dataset is located.
  3. Open the Tables item under the schema you selected, and click the streaming table or materialized view.

From here, you can use the tabs under the dataset name to view and edit information about the dataset, including:

  • Refresh status and history
  • The table schema
  • Sample data (requires an active compute)
  • Permissions
  • Lineage, including tables and other pipelines that this dataset depends on
  • Insights into usage
  • Monitors that you have created for this dataset

Stop an active refresh

To stop an active refresh in the Databricks UI, click Stop on the Pipeline details page to stop the pipeline update. You can also stop the refresh with the Databricks CLI or the POST /api/2.0/pipelines/{pipeline_id}/stop operation in the Pipelines REST API.

View the history of runs for a scheduled refresh

If you open your dataset in Catalog Explorer, the details pane on the right side of the workspace shows the Refresh schedule. Clicking the schedule link (for example, Every 1 hour) takes you to the job page for the system-managed job that runs the schedule. There you can see the history of runs, including a graph of the last 48 hours of runs showing success or failure and time taken. Click a specific run to get more details.

You can't edit this system-managed job. To make changes to the schedule, edit the definition of the pipeline with CREATE OR REFRESH or with ALTER. See Modify an existing schedule or trigger.

Timeouts for refreshes

Pipeline refreshes are run with a timeout that limits how long they can run. For Databricks SQL pipelines created or updated on or after August 14, 2025, the timeout is captured when you update by running CREATE OR REFRESH:

  • If a STATEMENT_TIMEOUT is set, that value is used. See STATEMENT_TIMEOUT.
  • Otherwise, the timeout from the SQL warehouse used to run the command is used.
  • If the warehouse does not have a timeout configured, a default of 2 days applies.

The timeout is used on the initial create, but also on scheduled refreshes that follow.

For streaming tables that were last updated prior to August 14, 2025, the timeout is set to 2 days.

Example: Set a timeout for a refresh

You can explicitly control how long a refresh is allowed to run by setting a statement-level timeout when creating or updating the dataset:

SQL
SET STATEMENT_TIMEOUT = '6h';

CREATE OR REFRESH MATERIALIZED VIEW my_catalog.my_schema.my_mv
SCHEDULE EVERY 12 HOURS
AS SELECT * FROM large_source_table;

This sets up the materialized view to be refreshed every 12 hours, and if a refresh takes more than 6 hours, it times out and waits for the next scheduled refresh.

How scheduled refreshes handle timeouts

Timeouts are synchronized only when you explicitly run CREATE OR REFRESH.

  • Scheduled refreshes continue using the timeout captured during the most recent CREATE OR REFRESH.
  • Changing the warehouse timeout alone does not affect existing scheduled refreshes.

Important

After changing a warehouse timeout, run CREATE OR REFRESH again to apply the new timeout to future scheduled refreshes.
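
For example, re-running the statement from the earlier timeout example captures whatever timeout is now in effect and applies it to future scheduled refreshes:

SQL
CREATE OR REFRESH MATERIALIZED VIEW my_catalog.my_schema.my_mv
SCHEDULE EVERY 12 HOURS
AS SELECT * FROM large_source_table;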

Get notifications for scheduled refreshes

Beta

The notifications for DDL scheduled refreshes feature is in Beta. Workspace admins can control access to this feature from the Previews page by opting into the System-Managed Job for Materialized Views & Streaming Tables preview. See Manage Databricks previews.

When you create a schedule for your pipeline, you can edit it to get notifications. There are multiple ways to schedule pipelines, and getting notifications depends on which of these methods you choose:

  • Scheduled with a job: To get notifications from a SQL task in Lakeflow Jobs, edit the task and add notifications. See SQL task for jobs.

    You have a wide range of options for which notifications to receive, and how to receive them. See Add notifications on a job.

  • Scheduled with a SCHEDULE clause: To get notifications from a pipeline that is scheduled by a SCHEDULE clause in the SQL definition, edit it in the Catalog Explorer:

    1. Open the dataset in Catalog Explorer.

    2. On the Overview tab, under Refresh schedule, click the pencil icon to edit the schedule you want to receive notifications for.

    3. Under More options, add or modify notifications.

      You have the option to get notified via email on start, success, or failure of the scheduled refresh. By default, the owner is notified on failure only.

      The email includes a link that takes you to the run history for the system-managed job that orchestrates your schedule. See View the history of runs for a scheduled refresh.

Select a performance mode for scheduled refreshes

The serverless compute used by the pipeline runs in Performance-optimized mode when run through the UI.

For pipelines scheduled in the SQL definition, you can select the serverless compute performance mode using the Performance optimized setting in Catalog Explorer. When this setting is disabled (the default), the pipeline uses standard performance mode. Standard performance mode is designed to reduce costs for workloads where a slightly higher launch latency is acceptable. Serverless workloads using standard performance mode typically start within four to six minutes after being triggered, depending on compute availability and optimized scheduling.

When Performance optimized is enabled, your pipeline is optimized for performance, resulting in faster startup and execution for time-sensitive workloads.

Both modes use the same SKU, but standard performance mode consumes fewer DBUs, reflecting lower compute usage.

Beta

Modifying the performance mode for scheduled refreshes is in Beta. Workspace admins can control access to this feature from the Previews page by opting into the System-Managed Job for Materialized Views & Streaming Tables preview. See Manage Databricks previews.

By default, pipelines use the performance-optimized mode when run interactively in the UI or when scheduled with a SQL task, and use the standard mode when scheduled in the SQL definition. To set the compute mode for pipelines scheduled by a SCHEDULE clause in the definition, edit the schedule in Catalog Explorer:

  1. Open the dataset in Catalog Explorer.
  2. On the Overview tab, under Refresh schedule, click the pencil icon to edit the schedule you want to modify.
  3. Check Performance optimized to use the performance-optimized mode on future scheduled refreshes.