The Databricks SQL create table UI allows you to quickly upload a CSV or TSV file and create a Delta table.
For loading files from cloud storage such as Azure Data Lake Storage Gen2, AWS S3, or Google Cloud Storage, check out the tutorial on COPY INTO.
Create table in Databricks SQL can create managed Delta tables in Unity Catalog or in the Hive metastore.
Unity Catalog is in Public Preview. To participate in the preview, contact your Databricks representative.
For Unity Catalog, you must also have the USAGE permission on the parent catalog of the selected schema.
If your workspace is assigned to a Unity Catalog metastore, you can still create tables under schemas in the Hive metastore.
You must have CREATE permissions on the schema you want to create a table in.
You must have a running SQL Warehouse.
You can use the UI to create a Delta table by importing small CSV or TSV files to Databricks SQL from your local machine.
The upload UI supports uploading one file at a time; the file must be smaller than 100 megabytes.
The file must be a CSV or TSV and have the extension “.csv” or “.tsv”.
Compressed files such as tar files are not supported.
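These constraints can be checked locally before you upload. The following sketch mirrors the rules above (the function name and the exact size comparison are our own, not part of the product):

```python
import os

ALLOWED_EXTENSIONS = {".csv", ".tsv"}
MAX_SIZE_BYTES = 100 * 1024 * 1024  # 100 MB upload limit

def is_uploadable(path: str) -> bool:
    """Return True if the file satisfies the upload UI's constraints
    as described above (illustrative check, not the UI's own code)."""
    _, ext = os.path.splitext(path)
    if ext.lower() not in ALLOWED_EXTENSIONS:
        return False  # must be .csv or .tsv; archives such as .tar are rejected
    return os.path.getsize(path) < MAX_SIZE_BYTES
```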
Navigate to the SQL persona by using the persona switcher.
To change the persona, click the icon below the Databricks logo, and select a persona.
Click Create in the sidebar and select Table from the menu.
The Create table in Databricks SQL page appears.
To start an upload, click the file browser button or drag-and-drop files directly on the drop zone.
Imported files are uploaded to a secure internal location within your account which is garbage collected daily.
Upon completion of upload, you can select the destination for your data.
For workspaces that are assigned to a Unity Catalog metastore, you can select a catalog. If your workspace is not assigned to a Unity Catalog metastore, the destination catalog is hidden, and schemas are loaded from the Hive metastore.
To use the Hive metastore in a workspace that has been assigned to a Unity Catalog metastore, select hive_metastore in the catalog selector.
Select a schema.
By default, the UI converts the file name to a valid table name. You can edit the table name.
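The UI's exact name-conversion rules are not documented here, but the idea can be sketched in a few lines of Python (the regex and function name below are our own assumptions, not the product's implementation):

```python
import re

def filename_to_table_name(filename: str) -> str:
    """Derive a plausible table name from an uploaded file name.
    Illustrative only; the upload UI's actual rules may differ."""
    base = filename.rsplit(".", 1)[0]           # drop the .csv/.tsv extension
    name = re.sub(r"[^0-9a-zA-Z_]", "_", base)  # replace invalid characters
    if name and name[0].isdigit():
        name = "_" + name                       # avoid a leading digit
    return name.lower()
```

For example, a file named "Sales Report 2022.csv" would become the table name sales_report_2022 under this sketch.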
After the file upload is complete, you can preview the data (limit of 50 rows).
After the upload, the UI tries to start the SQL warehouse selected in the top right. You can switch warehouses at any time, but the preview and table creation require a running warehouse. If your warehouse is not running yet, it starts automatically; this may take some time. The preview starts when your warehouse is running.
There are two ways to preview the data: vertically or horizontally. To switch between preview options, click the toggle button above the table.
Depending on the file format uploaded, different options are available. Common format options appear in the header bar, while less commonly used options are available in the Advanced attributes modal.
For CSV, the following options are available.
First row contains the header (enabled by default): This option specifies whether the CSV/TSV file contains a header.
Column delimiter: The separator character between columns. Only a single character is allowed, and backslash is not supported. This defaults to comma for CSV files.
Automatically detect column types (enabled by default): Automatically detect column types from file content. You can edit types in the preview table. If this is set to false, all column types are inferred as STRING.
Rows span multiple lines (disabled by default): Whether a column’s value can span multiple lines in the file.
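The delimiter and multiline options correspond to standard CSV parsing behavior, which can be illustrated with Python's csv module (this is only an analogy, not how the upload UI parses files): a custom delimiter maps to the reader's delimiter argument, and a quoted field containing a newline is exactly the case the "Rows span multiple lines" option handles.

```python
import csv
import io

# One logical row whose quoted "notes" field spans two physical lines.
data = 'name,notes\n"Ada","line one\nline two"\n'

rows = list(csv.reader(io.StringIO(data), delimiter=","))
# rows[1] is a single record even though it occupies two lines in the file.
```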
The data preview updates automatically when you edit format options.
You can edit column header names and types.
To edit types, click the icon with the type.
To edit the column name, click the input box at the top of the column.
Column names do not support commas, backslashes, or Unicode characters (such as emojis).
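A quick local check for these restrictions might look like the following (the function covers only the three cases named above; the UI's full validation may be stricter):

```python
def is_valid_column_name(name: str) -> bool:
    """Reject commas, backslashes, and non-ASCII characters such as emojis,
    per the column-name restrictions described above (illustrative only)."""
    if "," in name or "\\" in name:
        return False
    return name.isascii()
```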
For CSV files, the column data types are inferred by default. You can interpret all columns as
STRING type by disabling Advanced attributes > Automatically detect column types.
Schema inference does a best-effort detection of column types. Changing column types may lead to certain values being cast to NULL if the value cannot be cast correctly to the target data type. Casting TIMESTAMP columns is not supported. Databricks recommends that you create a table first and then transform these columns using SQL functions afterwards.
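The cast-to-NULL behavior can be illustrated outside Databricks with a best-effort cast in plain Python (a hypothetical helper, not the engine's actual logic):

```python
def try_cast_bigint(value: str):
    """Best-effort cast to an integer; returns None (i.e. NULL) when the
    value cannot be cast, mirroring the behavior described above."""
    try:
        return int(value)
    except ValueError:
        return None

column = ["42", "7", "not a number"]
casted = [try_cast_bigint(v) for v in column]  # the last value becomes None
```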
To support table column names with special characters, the Create table upload UI in Databricks SQL uses column mapping.
To add comments to columns, create the table and navigate to Data Explorer where you can add comments.
Create table using CSV upload supports the following data types. For more information about individual data types, see SQL data types.
BIGINT: 8-byte signed integer numbers.
DATE: Values comprising fields year, month, and day, without a time zone.
DOUBLE: 8-byte double-precision floating point numbers.
STRING: Character string values.
TIMESTAMP: Values comprising fields year, month, day, hour, minute, and second, with the session local time zone.
To create the table, click Create at the bottom of the page.
After you create the table using Create table in Databricks SQL, the Data Explorer page for the Delta table under the designated catalog and schema appears.