PostgreSQL connector reference

Preview

The PostgreSQL connector for Lakeflow Connect is in Public Preview. Reach out to your Databricks account team to enroll in the Public Preview.

This page contains reference material for the PostgreSQL connector in Databricks Lakeflow Connect.

Automatic data transformations

Databricks automatically transforms the following PostgreSQL data types to Delta-compatible data types:

| PostgreSQL type | Delta type |
| --- | --- |
| BOOLEAN | BOOLEAN |
| SMALLINT | SMALLINT |
| INTEGER | INT |
| BIGINT | BIGINT |
| DECIMAL(p,s) | DECIMAL(p,s) |
| NUMERIC(p,s) | DECIMAL(p,s) |
| REAL | FLOAT |
| DOUBLE PRECISION | DOUBLE |
| SMALLSERIAL | SMALLINT |
| SERIAL | INT |
| BIGSERIAL | BIGINT |
| MONEY | STRING |
| CHAR(n) | STRING |
| VARCHAR(n) | STRING |
| TEXT | STRING |
| BYTEA | BINARY |
| DATE | DATE |
| TIME | STRING |
| TIME WITH TIME ZONE | STRING |
| TIMESTAMP | STRING |
| TIMESTAMP WITH TIME ZONE | TIMESTAMP |
| INTERVAL | STRING |
| UUID | BINARY |
| JSON | STRING |
| JSONB | STRING |
| XML | STRING |
| ARRAY | STRING |
| CIDR | STRING |
| INET | STRING |
| MACADDR | STRING |
| BIT(n) | BINARY |
| BIT VARYING(n) | BINARY |

Important notes about date and time types

  • The DATE data type is ingested as the Delta DATE type. The full PostgreSQL date range is supported. Inf and -Inf values are converted to null. BC dates are stored using astronomical year numbering: for example, 1 BC maps to year 0 and 2 BC maps to year -1.
  • The TIMESTAMP (without time zone) data type is ingested as strings. Inf and -Inf values are preserved as strings.
  • The TIMESTAMP WITH TIME ZONE data type is ingested as the Delta TIMESTAMP type. The PostgreSQL-supported range is 4713-01-01 00:00:00.000000 BC to 294276-12-31 23:59:59.999999 AD, whereas the Databricks-supported range is -290308-12-21 BCE 19:59:06 GMT to +294247-01-10 CE 04:00:54 GMT. Timestamps above the Databricks maximum supported timestamp are converted to null. BC dates are stored using astronomical year numbering: for example, 1 BC maps to year 0 and 2 BC maps to year -1. Inf and -Inf values are converted to null.
  • The TIME, TIME WITH TIME ZONE, and INTERVAL data types are ingested as strings in their original format. You can use Databricks SQL functions to parse these strings into the appropriate time representations if needed. For the INTERVAL data type, infinity values are mapped to 0 years 0 mons 0 days 0 hours 0 mins 0.0 secs.
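To illustrate what such downstream parsing might look like (shown here in Python rather than Databricks SQL; the sample values are hypothetical), TIMESTAMP and TIME strings in their typical PostgreSQL output format can be converted with standard datetime parsing:

```python
from datetime import datetime

# TIMESTAMP (without time zone) arrives as a string such as
# "2024-03-15 10:30:00.123456" (hypothetical sample value).
ts = datetime.strptime("2024-03-15 10:30:00.123456", "%Y-%m-%d %H:%M:%S.%f")

# TIME arrives as a string such as "10:30:00".
t = datetime.strptime("10:30:00", "%H:%M:%S").time()

print(ts.year, t.hour)  # → 2024 10
```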

User-defined and third-party data types

User-defined types and data types from third-party PostgreSQL extensions are ingested as strings. If you need to transform these types, you can use Databricks SQL functions to parse the string representations in downstream processing.

Composite types

PostgreSQL composite types (also known as row types) are ingested as strings. The string representation follows PostgreSQL's composite type format: (value1,value2,value3).
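As a minimal sketch of splitting that format downstream (in Python rather than Databricks SQL; the `parse_composite` helper and sample value are illustrative, and the sketch does not cover every PostgreSQL escaping rule, such as NULL fields or embedded backslashes):

```python
import csv
import io

def parse_composite(value: str) -> list[str]:
    """Split a PostgreSQL composite-type string like "(a,b,c)" into fields.

    Uses the csv module so that double-quoted fields containing commas
    survive the split; this is a sketch, not a full composite parser.
    """
    inner = value.strip()[1:-1]  # drop the surrounding parentheses
    return next(csv.reader(io.StringIO(inner)))

print(parse_composite('(42,"hello, world",3.14)'))  # → ['42', 'hello, world', '3.14']
```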