INSERT INTO
Inserts new rows into a table and optionally truncates the table or partitions. You specify the inserted rows by value expressions or the result of a query.
Syntax
INSERT { OVERWRITE | INTO } [ TABLE ] table_name
[ PARTITION clause ]
[ ( column_name [, ...] ) ]
query
Note
When you INSERT INTO
a Delta table schema enforcement and evolution is supported. If a column’s data type cannot be safely cast to a Delta table’s data type, a runtime exception is thrown. If schema evolution is enabled, new columns can exist as the last columns of your schema (or nested columns) for the schema to evolve.
Parameters
INTO
orOVERWRITE
If you specify
OVERWRITE
the following applies:Without a
partition_spec
the table is truncated before inserting the first row.Otherwise all partitions matching the
partition_spec
are truncated before inserting the first row.
If you specify
INTO
all rows inserted are additive to the existing rows.-
Identifies the table to be inserted to. The name must not include a temporal specification.
-
An optional parameter that specifies a target partition for the insert. You may also only partially specify the partition.
( column_name [, …] )
An optional permutation of all the columns in the table. You can use this clause to map the columns if the columns returned by the
query
does not line up with the natural order of the column.-
A query that produces the rows to be inserted.
You must match the number of columns returned by the query with the columns in the table excluding partitioning columns with assigned values in the
PARTITION
clause.If a data type cannot be safely cast to the matching column data type, a runtime exception is thrown.
If schema evolution is enabled, new columns can exist as the last columns of your schema (or nested columns) for the schema to evolve.
Dynamic partition inserts
In PARTITION clause, the partition column values are optional. When the partition specification part_spec
is not completely provided, such inserts are called dynamic partition inserts or multi-partition inserts. When the values are not specified, these columns are referred to as dynamic partition columns; otherwise, they are static partition columns. For example, the partition spec (p1 = 3, p2, p3)
has a static partition column (p1
) and two dynamic partition columns (p2
and p3
).
In PARTITION clause, static partition keys must come before the dynamic partition keys. This means all partition columns having constant values must appear before other partition columns that do not have an assigned constant value.
The partition values of dynamic partition columns are determined during the execution. The dynamic partition columns must be specified last in both part_spec
and the input result set (of the row value lists or the select query). They are resolved by position, instead of by names. Thus, the orders must be exactly matched.
The DataFrameWriter APIs do not have an interface to specify partition values. Therefore, the insertInto()
API is always using dynamic partition mode.
Examples
In this section:
INSERT INTO
Single row insert using a VALUES
clause
> CREATE TABLE students (name VARCHAR(64), address VARCHAR(64), student_id INT)
PARTITIONED BY (student_id);
> INSERT INTO students VALUES
('Amy Smith', '123 Park Ave, San Jose', 111111);
> SELECT * FROM students;
name address student_id
--------- --------------------- ----------
Amy Smith 123 Park Ave,San Jose 111111
Multi-row insert using a VALUES
clause
> INSERT INTO students VALUES
('Bob Brown', '456 Taylor St, Cupertino', 222222),
('Cathy Johnson', '789 Race Ave, Palo Alto', 333333);
> SELECT * FROM students;
name address student_id
------------- ------------------------ ----------
Amy Smith 123 Park Ave, San Jose 111111
Bob Brown 456 Taylor St, Cupertino 222222
Cathy Johnson 789 Race Ave, Palo Alto 333333
Insert using a subquery
-- Assuming the persons table has already been created and populated.
> SELECT * FROM persons;
name address ssn
------------- ------------------------- ---------
Dora Williams 134 Forest Ave, Melo Park 123456789
Eddie Davis 245 Market St, Milpitas 345678901
> INSERT INTO students PARTITION (student_id = 444444)
SELECT name, address FROM persons WHERE name = "Dora Williams";
> SELECT * FROM students;
name address student_id
------------- ------------------------- ----------
Amy Smith 123 Park Ave, San Jose 111111
Bob Brown 456 Taylor St, Cupertino 222222
Cathy Johnson 789 Race Ave, Palo Alto 333333
Dora Williams 134 Forest Ave, Melo Park 444444
Insert using a TABLE
clause
-- Assuming the visiting_students table has already been created and populated.
> SELECT * FROM visiting_students;
name address student_id
------------- --------------------- ----------
Fleur Laurent 345 Copper St, London 777777
Gordon Martin 779 Lake Ave, Oxford 888888
> INSERT INTO students TABLE visiting_students;
> SELECT * FROM students;
name address student_id
------------- ------------------------- ----------
Amy Smith 123 Park Ave,San Jose 111111
Bob Brown 456 Taylor St, Cupertino 222222
Cathy Johnson 789 Race Ave, Palo Alto 333333
Dora Williams 134 Forest Ave, Melo Park 444444
Fleur Laurent 345 Copper St, London 777777
Gordon Martin 779 Lake Ave, Oxford 888888
Insert into a directory
> CREATE TABLE students (name VARCHAR(64), address VARCHAR(64), student_id INT)
PARTITIONED BY (student_id)
LOCATION "/mnt/user1/students";
> INSERT INTO delta.`/mnt/user1/students` VALUES
('Amy Smith', '123 Park Ave, San Jose', 111111);
> SELECT * FROM students;
name address student_id
------------- ------------------------- ----------
Amy Smith 123 Park Ave, San Jose 111111
Insert with a column list
> INSERT INTO students (address, name, student_id) VALUES
('Hangzhou, China', 'Kent Yao', 11215016);
> SELECT * FROM students WHERE name = 'Kent Yao';
name address student_id
--------- ---------------------- ----------
Kent Yao Hangzhou, China 11215016
Insert with both a partition spec and a column list
> INSERT INTO students PARTITION (student_id = 11215017) (address, name) VALUES
('Hangzhou, China', 'Kent Yao Jr.');
> SELECT * FROM students WHERE student_id = 11215017;
name address student_id
------------ ---------------------- ----------
Kent Yao Jr. Hangzhou, China 11215017
INSERT OVERWRITE
Insert using a VALUES
clause
-- Assuming the students table has already been created and populated.
> SELECT * FROM students;
name address student_id
------------- ------------------------- ----------
Amy Smith 123 Park Ave, San Jose 111111
Bob Brown 456 Taylor St, Cupertino 222222
Cathy Johnson 789 Race Ave, Palo Alto 333333
Dora Williams 134 Forest Ave, Melo Park 444444
Fleur Laurent 345 Copper St, London 777777
Gordon Martin 779 Lake Ave, Oxford 888888
Helen Davis 469 Mission St, San Diego 999999
Jason Wang 908 Bird St, Saratoga 121212
> INSERT OVERWRITE students VALUES
('Ashua Hill', '456 Erica Ct, Cupertino', 111111),
('Brian Reed', '723 Kern Ave, Palo Alto', 222222);
> SELECT * FROM students;
name address student_id
---------- ----------------------- ----------
Ashua Hill 456 Erica Ct, Cupertino 111111
Brian Reed 723 Kern Ave, Palo Alto 222222
Insert using a subquery
-- Assuming the persons table has already been created and populated.
> SELECT * FROM persons;
name address ssn
------------- ------------------------- ---------
Dora Williams 134 Forest Ave, Melo Park 123456789
Eddie Davis 245 Market St,Milpitas 345678901
> INSERT OVERWRITE students PARTITION (student_id = 222222)
SELECT name, address FROM persons WHERE name = "Dora Williams";
> SELECT * FROM students;
name address student_id
------------- ------------------------- ----------
Ashua Hill 456 Erica Ct, Cupertino 111111
Dora Williams 134 Forest Ave, Melo Park 222222
Insert using a TABLE
clause
-- Assuming the visiting_students table has already been created and populated.
> SELECT * FROM visiting_students;
name address student_id
------------- --------------------- ----------
Fleur Laurent 345 Copper St, London 777777
Gordon Martin 779 Lake Ave, Oxford 888888
> INSERT OVERWRITE students TABLE visiting_students;
> SELECT * FROM students;
name address student_id
------------- --------------------- ----------
Fleur Laurent 345 Copper St, London 777777
Gordon Martin 779 Lake Ave, Oxford 888888
Insert overwrite a directory
> CREATE TABLE students (name VARCHAR(64), address VARCHAR(64), student_id INT)
PARTITIONED BY (student_id)
LOCATION "/mnt/user1/students";
> INSERT OVERWRITE delta.`/mnt/user1/students` VALUES
('Amy Smith', '123 Park Ave, San Jose', 111111);
> SELECT * FROM students;
name address student_id
------------- ------------------------- ----------
Amy Smith 123 Park Ave, San Jose 111111