Update (Delta Lake on Databricks)
UPDATE [db_name.]table_name [AS alias] SET col1 = value1 [, col2 = value2 ...] [WHERE predicate]
Update the column values for the rows that match a predicate. When no predicate is provided, update the column values for all rows.
Note
(Delta Lake on Databricks) If a column has a NOT NULL
constraint, and an INSERT INTO
statement sets a column value to NULL
, a SparkException
is thrown.
WHERE
Filter rows by predicate.
Example
UPDATE events SET eventType = 'click' WHERE eventType = 'clk'
UPDATE
supports subqueries in the WHERE
predicate, including IN
, NOT IN
, EXISTS
, NOT EXISTS
, and scalar subqueries.
Subquery Examples
UPDATE all_events
SET session_time = 0, ignored = true
WHERE session_time < (SELECT min(session_time) FROM good_events)
UPDATE orders AS t1
SET order_status = 'returned'
WHERE EXISTS (SELECT oid FROM returned_orders WHERE t1.oid = oid)
UPDATE events
SET category = 'undefined'
WHERE category NOT IN (SELECT category FROM events2 WHERE date > '2001-01-01')
Note
The following types of subqueries are not supported:
Nested subqueries, that is, a subquery inside another subquery
A
NOT IN
subquery inside anOR
, for example,a = 3 OR b NOT IN (SELECT c from t)
In most cases, you can rewrite NOT IN
subqueries using NOT EXISTS
. We recommend using
NOT EXISTS
whenever possible, as UPDATE
with NOT IN
subqueries can be slow.