MERGE INTO

Applies to: Databricks SQL, Databricks Runtime

Merges a set of updates, insertions, and deletions based on a source table into a target Delta table.

This statement is supported only for Delta Lake tables.

Syntax

MERGE INTO target_table_name [target_alias]
   USING source_table_reference [source_alias]
   ON merge_condition
   { WHEN MATCHED [ AND matched_condition ] THEN matched_action |
     WHEN NOT MATCHED [ AND not_matched_condition ] THEN not_matched_action } [...]

matched_action
 { DELETE |
   UPDATE SET * |
   UPDATE SET { column = [ expr | DEFAULT ] } [, ...] }

not_matched_action
 { INSERT * |
   INSERT (column1 [, ...] ) VALUES ( expr | DEFAULT ] [, ...] )

Parameters

  • target_table_name

    A Table name identifying the table being modified. The table referenced must be a Delta table.

  • target_alias

    A Table aliasfor the target table. The alias must not include a column list.

  • source_table_reference

    A Table name identifying the source table to be merged into the target table.

  • source_alias

    A Table alias for the source table. The alias must not include a column list.

  • ON merge_condition

    How the rows from one relation are combined with the rows of another relation. An expression with a return type of BOOLEAN.

  • WHEN MATCHED [ AND matched_condition ]

    WHEN MATCHED clauses are executed when a source row matches a target table row based on the merge_condition and the optional match_condition.

  • matched_action

    • DELETE

      Deletes the matching target table row.

      Multiple matches are allowed when matches are unconditionally deleted. An unconditional delete is not ambiguous, even if there are multiple matches.

    • UPDATE

      Updates the matched target table row.

      To update all the columns of the target Delta table with the corresponding columns of the source dataset, use UPDATE SET *. This is equivalent to UPDATE SET col1 = source.col1 [, col2 = source.col2 ...] for all the columns of the target Delta table. Therefore, this action assumes that the source table has the same columns as those in the target table, otherwise the query will throw an analysis error.

      Note

      This behavior changes when automatic schema migration is enabled. See Automatic schema evolution for Delta Lake merge for details.

      Applies to: check marked yes Databricks SQL SQL warehouse version 2022.35 or higher check marked yes Databricks Runtime 11.2 and above

      You can specify DEFAULT as expr to explicitly update the column to its default value.

    If there are multiple WHEN MATCHED clauses, then they are evaluated in the order they are specified. Each WHEN MATCHED clause, except the last one, must have a matched_condition.

    If none of the WHEN MATCHED conditions evaluate to true for a source and target row pair that matches the merge_condition, then the target row is left unchanged.

  • WHEN NOT MATCHED [ AND not_matched_condition ]

    WHEN NOT MATCHED clauses insert a row when a source row does not match any target row based on the merge_condition and the optional not_matched_condition.

    not_matched_condition must be a Boolean expression.

    • INSERT *

      Inserts all the columns of the target Delta table with the corresponding columns of the source dataset. This is equivalent to INSERT (col1 [, col2 ...]) VALUES (source.col1 [, source.col2 ...]) for all the columns of the target Delta table. This action requires that the source table has the same columns as those in the target table.

      Note

      This behavior changes when automatic schema migration is enabled. See Automatic schema evolution for Delta Lake merge for details.

    • INSERT ( ... ) VALUES ( ... )

      The new row is generated based on the specified column and corresponding expressions. All the columns in the target table do not need to be specified. For unspecified target columns, the column default is inserted, or NULL if none exists.

      Applies to: check marked yes Databricks SQL SQL warehouse version 2022.35 or higher check marked yes Databricks Runtime 11.2 and above

      You can specify DEFAULT as an expression to explicitly insert the column default for a target column.

    If there are multiple WHEN NOT MATCHED clauses, then they are evaluated in the order they are specified. All WHEN NOT MATCHED clauses, except the last one, must have not_matched_conditions.

Important

A MERGE operation can fail if multiple rows of the source dataset match and attempt to update the same rows of the target Delta table. According to the SQL semantics of merge, such an update operation is ambiguous as it is unclear which source row should be used to update the matched target row. You can preprocess the source table to eliminate the possibility of multiple matches. See the Change data capture example—it preprocesses the change dataset (that is, the source dataset) to retain only the latest change for each key before applying that change into the target Delta table.

Examples

You can use MERGE INTO for complex operations like deduplicating data, upserting change data, applying SCD Type 2 operations, etc. See Upsert into a Delta Lake table using merge for a few examples.