Copy Into (Delta Lake on Databricks)
Important
This documentation has been retired and might not be updated. The products, services, or technologies mentioned in this content are no longer supported. See COPY INTO.
COPY INTO table_identifier
FROM [ file_location | (SELECT identifier_list FROM file_location) ]
FILEFORMAT = data_source
[FILES = [file_name, ... | PATTERN = 'regex_pattern']
[FORMAT_OPTIONS ('data_source_reader_option' = 'value', ...)]
[COPY_OPTIONS 'force' = ('false'|'true')]
Load data from a file location into a Delta table. This is a re-triable and idempotent operation—files in the source location that have already been loaded are skipped.
table_identifier
The Delta table to copy into.
FROM file_location
The file location to load the data from. Files in this location must have the format specified in FILEFORMAT
.
SELECT identifier_list
Selects the specified columns or expressions from the source data before copying into the Delta table.
FILEFORMAT = data_source
The format of the source files to load. One of CSV
, JSON
, AVRO
, ORC
, PARQUET
.
FILES
A list of file names to load, with length up to 1000. Cannot be specified with PATTERN
.
PATTERN
A regex pattern that identifies the files to load from the source directory. Cannot be specified with FILES
.
FORMAT_OPTIONS
Options to be passed to the Apache Spark data source reader for the specified format.
COPY_OPTIONS
Options to control the operation of the COPY INTO
command. The only option is 'force'
; if set to 'true'
, idempotency is disabled and files are loaded regardless of whether they’ve been loaded before.