INSERT OVERWRITE DIRECTORY

Applies to: check marked yes Databricks SQL check marked yes Databricks Runtime

Overwrites the existing data in the directory with the new values using a given Spark file format. You specify the inserted row by value expressions or the result of a query.

Syntax

INSERT OVERWRITE [ LOCAL ] DIRECTORY [ directory_path ]
    USING file_format [ OPTIONS ( { key [ = ] val } [ , ... ] ) ]
    { VALUES ( { value | NULL } [ , ... ] ) [ , ( ... ) ] | query }

Parameters

  • directory_path

    The destination directory. It can also be specified in OPTIONS using path. The LOCAL keyword is used to specify that the directory is on the local file system.

  • file_format

    The file format to use for the insert. Valid options are TEXT, CSV, JSON, JDBC, PARQUET, ORC, HIVE, LIBSVM, or a fully qualified class name of a custom implementation of org.apache.spark.sql.execution.datasources.FileFormat.

  • OPTIONS ( { key [ = ] val } [ , … ] )

    Specifies one or more options for the writing of the file format.

  • VALUES ( { value | NULL } [ , … ] ) [ , ( … ) ]

    The values to be inserted. Either an explicitly specified value or a NULL can be inserted. A comma must be used to separate each value in the clause. More than one set of values can be specified to insert multiple rows.

  • query

    A query that produces the rows to be inserted. One of following formats:

    • A SELECT statement

    • A TABLE statement

    • A FROM statement

Examples

INSERT OVERWRITE DIRECTORY '/tmp/destination'
    USING parquet
    OPTIONS (col1 1, col2 2, col3 'test')
    SELECT * FROM test_table;

INSERT OVERWRITE DIRECTORY
    USING parquet
    OPTIONS ('path' '/tmp/destination', col1 1, col2 2, col3 'test')
    SELECT * FROM test_table;