Skip to main content

PostgreSQL database user requirements

Preview

The PostgreSQL connector for Lakeflow Connect is in Public Preview. Reach out to your Databricks account team to enroll in the Public Preview.

This page describes the privileges required for the PostgreSQL replication user that you create for Databricks ingestion.

Standard PostgreSQL

Required privileges

Where to run the commands

Commands to grant

REPLICATION role

Database cluster (superuser required)

Run the following SQL commands:

SQL
CREATE USER databricks_replication WITH PASSWORD 'your_secure_password';
ALTER USER databricks_replication WITH REPLICATION;

CONNECT privilege on the database

Target database

Run the following SQL command:

SQL
GRANT CONNECT ON DATABASE your_database TO databricks_replication;

USAGE privilege on schemas

Target database

Run the following SQL command for each schema:

SQL
GRANT USAGE ON SCHEMA schema_name TO databricks_replication;

SELECT privilege on tables

Target database

Run the following SQL command for each table or schema:

SQL
-- For specific tables
GRANT SELECT ON TABLE schema_name.table_name TO databricks_replication;

-- For all tables in a schema
GRANT SELECT ON ALL TABLES IN SCHEMA schema_name TO databricks_replication;

-- For future tables in a schema
ALTER DEFAULT PRIVILEGES IN SCHEMA schema_name GRANT SELECT ON TABLES TO databricks_replication;

AWS RDS and Aurora PostgreSQL

Required privileges

Where to run the commands

Commands to grant

rds_replication role

Any database

Run the following SQL commands:

SQL
CREATE USER databricks_replication WITH PASSWORD 'your_secure_password';
GRANT rds_replication TO databricks_replication;

CONNECT privilege on the database

Target database

Run the following SQL command:

SQL
GRANT CONNECT ON DATABASE your_database TO databricks_replication;

USAGE privilege on schemas

Target database

Run the following SQL command for each schema:

SQL
GRANT USAGE ON SCHEMA schema_name TO databricks_replication;

SELECT privilege on tables

Target database

Run the following SQL command for each table or schema:

SQL
-- For specific tables
GRANT SELECT ON TABLE schema_name.table_name TO databricks_replication;

-- For all tables in a schema
GRANT SELECT ON ALL TABLES IN SCHEMA schema_name TO databricks_replication;

-- For future tables in a schema
ALTER DEFAULT PRIVILEGES IN SCHEMA schema_name GRANT SELECT ON TABLES TO databricks_replication;
note

For AWS RDS and Aurora, you don't need superuser privileges. The rds_replication role provides the necessary replication privileges.

Azure Database for PostgreSQL

Required privileges

Where to run the commands

Commands to grant

REPLICATION role

Target database

Run the following SQL commands:

SQL
CREATE USER databricks_replication WITH PASSWORD 'your_secure_password';
ALTER USER databricks_replication WITH REPLICATION;

CONNECT privilege on the database

Target database

Run the following SQL command:

SQL
GRANT CONNECT ON DATABASE your_database TO databricks_replication;

USAGE privilege on schemas

Target database

Run the following SQL command for each schema:

SQL
GRANT USAGE ON SCHEMA schema_name TO databricks_replication;

SELECT privilege on tables

Target database

Run the following SQL command for each table or schema:

SQL
-- For specific tables
GRANT SELECT ON TABLE schema_name.table_name TO databricks_replication;

-- For all tables in a schema
GRANT SELECT ON ALL TABLES IN SCHEMA schema_name TO databricks_replication;

-- For future tables in a schema
ALTER DEFAULT PRIVILEGES IN SCHEMA schema_name GRANT SELECT ON TABLES TO databricks_replication;
note

For Azure Database for PostgreSQL Flexible Server, ensure logical replication is enabled in the server parameters. For Single Server, the azure_replication_support server parameter must be set to REPLICA.

GCP Cloud SQL for PostgreSQL

Required privileges

Where to run the commands

Commands to grant

REPLICATION role

Target database

Run the following SQL commands:

SQL
CREATE USER databricks_replication WITH PASSWORD 'your_secure_password';
ALTER USER databricks_replication WITH REPLICATION;

CONNECT privilege on the database

Target database

Run the following SQL command:

SQL
GRANT CONNECT ON DATABASE your_database TO databricks_replication;

USAGE privilege on schemas

Target database

Run the following SQL command for each schema:

SQL
GRANT USAGE ON SCHEMA schema_name TO databricks_replication;

SELECT privilege on tables

Target database

Run the following SQL command for each table or schema:

SQL
-- For specific tables
GRANT SELECT ON TABLE schema_name.table_name TO databricks_replication;

-- For all tables in a schema
GRANT SELECT ON ALL TABLES IN SCHEMA schema_name TO databricks_replication;

-- For future tables in a schema
ALTER DEFAULT PRIVILEGES IN SCHEMA schema_name GRANT SELECT ON TABLES TO databricks_replication;

System catalog privileges

The replication user requires SELECT privileges on system catalog tables for all PostgreSQL platforms.

Run the following SQL commands on the target database:

SQL
GRANT SELECT ON pg_catalog.pg_publication TO databricks_replication;
GRANT SELECT ON pg_catalog.pg_replication_slots TO databricks_replication;
GRANT SELECT ON pg_catalog.pg_namespace TO databricks_replication;
GRANT SELECT ON pg_catalog.pg_class TO databricks_replication;
GRANT SELECT ON pg_catalog.pg_attribute TO databricks_replication;
GRANT SELECT ON pg_catalog.pg_type TO databricks_replication;
GRANT SELECT ON pg_catalog.pg_index TO databricks_replication;

Verify replication user privileges

After granting the necessary privileges, verify that the replication user has the correct permissions:

SQL
-- Check replication privilege
SELECT rolname, rolreplication FROM pg_roles WHERE rolname = 'databricks_replication';

-- Check table privileges
SELECT grantee, privilege_type
FROM information_schema.table_privileges
WHERE grantee = 'databricks_replication' AND table_schema = 'your_schema';

-- Check schema privileges
SELECT nspname, has_schema_privilege('databricks_replication', nspname, 'USAGE') AS has_usage
FROM pg_namespace
WHERE nspname = 'your_schema';