Refresh operations for materialized views

Materialized views are database objects that contain the results of a SQL query on one or more base tables. Some materialized views can be incrementally refreshed, automatically and incrementally propagating changes from the base tables. This article explains the refresh operations that can be applied to materialized views.

Refresh types

Refresh operations are one of these types:

  • Incremental refresh: An incremental refresh processes changes in the underlying data after the last refresh and then appends that data to the table. Depending on the base tables and included operations, only certain types of materialized views can be incrementally refreshed.

  • Full refresh: A full refresh truncates the table and reprocesses all data available in the source with the latest definition. It is not recommended to perform full refreshes on sources that don’t keep the entire data history or have short retention periods, such as Kafka, because the full refresh truncates the existing data. You may be unable to recover old data if the data is no longer available in the source.

How materialized views are refreshed

Materialized views automatically create and use Delta Live Tables pipelines to process refresh operations. Delta Live Tables pipelines use either a continuous or triggered execution mode. Materialized views can be updated in either execution mode. To avoid unnecessary processing when operating in continuous execution mode, pipelines automatically monitor dependent Delta tables and perform an update only when the contents of those dependent tables have changed. See What is a Delta Live Tables pipeline?.

Note

The Delta Live Tables runtime cannot detect changes in non-Delta data sources. The table is still updated regularly but with a higher default trigger interval to prevent excessive recomputation from slowing down any incremental processing happening on compute.

By default, refresh operations are performed synchronously. You can also set a refresh operation to occur asynchronously. The behavior associated with each approach is as follows:

  • Synchronous: A synchronous refresh blocks other operations until the refresh operation is complete. This allows you to sequence refresh operations in an orchestration tool, like workflows. To orchestrate materialized views with workflows, use the SQL task type. See Introduction to Databricks Workflows.

  • Asynchronous: An asynchronous refresh starts a background job on Delta Live Tables compute when a materialized view refresh begins, and the command returns before the data load is complete. Because a Delta Live Tables pipeline manages the refresh, the Databricks SQL warehouse used to create the materialized view is not used. It does not need to be running during the refresh operation.

Support for materialized view incremental refresh

The following table lists support for incremental refresh by SQL keyword or clause:

SQL keyword or clause

Support for incremental refresh

SELECT expressions

Expressions including deterministic built-in functions and immutable user-defined functions (UDFs) are supported.

WITH

Yes, common table expressions are supported.

FROM

Supported base tables include Delta tables, materialized views, and streaming tables

EXPECTATIONS

No. Materialized views that use expectations are always fully refreshed.

UNION ALL

No

INNER JOIN

No

LEFT JOIN

No

GROUP BY

Yes

WHERE, HAVING

Filter clauses such as WHERE and HAVING are supported.

OVER

No

QUALIFY

No

Note

Non-deterministic functions, for example, CURRENT_TIMESTAMP, are not supported.