Copy Into (Delta Lake on Databricks)


This documentation has been retired and might not be updated. The products, services, or technologies mentioned in this content are no longer supported. See COPY INTO.

COPY INTO table_identifier
  FROM [ file_location | (SELECT identifier_list FROM file_location) ]
  FILEFORMAT = data_source
  [FILES = (file_name, ...) | PATTERN = 'regex_pattern']
  [FORMAT_OPTIONS ('data_source_reader_option' = 'value', ...)]
  [COPY_OPTIONS ('force' = {'false' | 'true'})]

Load data from a file location into a Delta table. This is a retriable and idempotent operation—files in the source location that have already been loaded are skipped.
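For example, a minimal invocation (the table name and path below are illustrative, not from an actual workspace) might look like:

COPY INTO my_delta_table
  FROM '/data/input/events'
  FILEFORMAT = JSON

Running the same command again loads only files that have arrived since the previous run, so the command is safe to retry after a failure.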


table_identifier

The Delta table to copy into.

FROM file_location

The file location to load the data from. Files in this location must have the format specified in FILEFORMAT.

SELECT identifier_list

Selects the specified columns or expressions from the source data before copying into the Delta table.

FILEFORMAT = data_source

The format of the source files to load. One of CSV, JSON, AVRO, ORC, PARQUET.


FILES = (file_name, ...)

A list of file names to load, with length up to 1000. Cannot be specified with PATTERN.
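For example, to load two specific files (the file names, table name, and path here are hypothetical):

COPY INTO my_delta_table
  FROM '/data/input/events'
  FILEFORMAT = CSV
  FILES = ('file1.csv', 'file2.csv')
  FORMAT_OPTIONS('header' = 'true')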


PATTERN = 'regex_pattern'

A regex pattern that identifies the files to load from the source directory. Cannot be specified with FILES.


FORMAT_OPTIONS ('data_source_reader_option' = 'value', ...)

Options to be passed to the Apache Spark data source reader for the specified format.


COPY_OPTIONS ('force' = {'false' | 'true'})

Options to control the operation of the COPY INTO command. The only supported option is 'force'; when set to 'true', idempotency is disabled and files are loaded regardless of whether they have been loaded before.
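For example, to force a reload of all files in the location (table name and path are illustrative):

COPY INTO my_delta_table
  FROM '/data/input/events'
  FILEFORMAT = PARQUET
  COPY_OPTIONS('force' = 'true')

This reloads every file regardless of load history, which can be useful when source files have been overwritten in place.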


The following example loads selected columns, plus a constant expression, from CSV files matching a pattern:

COPY INTO delta.`target_path`
  FROM (SELECT key, index, textData, 'constant_value' FROM 'source_path')
  FILEFORMAT = CSV
  PATTERN = 'folder1/file_[a-g].csv'
  FORMAT_OPTIONS('header' = 'true')