COPY INTO (Delta Lake on Databricks)
Preview
This feature is in Public Preview.
Loads data from a file location into a Delta table. This is a retriable and idempotent operation; files in the source location that have already been loaded are skipped.
Syntax
COPY INTO table_identifier
FROM [ file_location | (SELECT identifier_list FROM file_location) ]
FILEFORMAT = data_source
[FILES = [file_name, ...] | PATTERN = 'regex_pattern']
[FORMAT_OPTIONS ('data_source_reader_option' = 'value', ...)]
[COPY_OPTIONS ('force' = {'false'|'true'})]
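For example, a minimal invocation (the table name and source path below are placeholders) loads every Parquet file under a directory into an existing Delta table:

COPY INTO my_delta_table
FROM '/mnt/raw/events'
FILEFORMAT = PARQUET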
table_identifier
[database_name.] table_name
: A table name, optionally qualified with a database name.
delta.`<path-to-table>`
: The location of an existing Delta table.
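For example, an existing Delta table can be targeted by path instead of by name (the paths here are placeholders):

COPY INTO delta.`/mnt/delta/events`
FROM '/mnt/raw/events'
FILEFORMAT = PARQUET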
FROM file_location
The file location to load the data from. Files in this location must have the format specified in FILEFORMAT.
SELECT identifier_list
Selects the specified columns or expressions from the source data before copying into the Delta table.
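For example, the following sketch renames and casts columns during the load (the column names are illustrative; _c0 and _c1 are Spark's default names for headerless CSV columns):

COPY INTO my_delta_table
FROM (SELECT _c0 AS id, CAST(_c1 AS DATE) AS event_date FROM '/mnt/raw/events')
FILEFORMAT = CSV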
FILEFORMAT = data_source
The format of the source files to load. One of CSV, JSON, AVRO, ORC, or PARQUET.
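For example, switching the same load (placeholder names again) to JSON source files only requires changing the format keyword:

COPY INTO my_delta_table
FROM '/mnt/raw/events'
FILEFORMAT = JSON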
FILES
A list of up to 1,000 file names to load. Cannot be specified with PATTERN.
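For example, to load just two specific files (the file names are placeholders):

COPY INTO my_delta_table
FROM '/mnt/raw/events'
FILEFORMAT = JSON
FILES = ('f1.json', 'f2.json')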
PATTERN
A regex pattern that identifies the files to load from the source directory. Cannot be specified with FILES.
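For example, to load only the CSV files in a subfolder whose names fall within a range (the pattern and paths are illustrative):

COPY INTO my_delta_table
FROM '/mnt/raw/events'
FILEFORMAT = CSV
PATTERN = 'folder1/file_[a-g].csv'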
FORMAT_OPTIONS
Options to be passed to the Apache Spark data source reader for the specified format.
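For example, 'header' and 'inferSchema' are standard Spark CSV reader options; this sketch (placeholder names) treats the first line of each file as a header:

COPY INTO my_delta_table
FROM '/mnt/raw/events'
FILEFORMAT = CSV
FORMAT_OPTIONS ('header' = 'true', 'inferSchema' = 'true')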
COPY_OPTIONS
Options to control the operation of the COPY INTO command. The only option is 'force'; if set to 'true', idempotency is disabled and files are loaded regardless of whether they've been loaded before.
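For example, to reload all files regardless of whether they have been loaded before (placeholder names):

COPY INTO my_delta_table
FROM '/mnt/raw/events'
FILEFORMAT = PARQUET
COPY_OPTIONS ('force' = 'true')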