Skip to main content

Access and work with a database instance

Preview

This feature is in Public Preview in the following regions: us-east-1, us-west-2, eu-west-1, ap-southeast-1, ap-southeast-2, eu-central-1, us-east-2, ap-south-1.

This page outlines the different ways to work with your database instance and recommends how to optimize PostgreSQL queries.

Ways to access your Database

Here are the following methods to access your database instance to run PostgreSQL queries and manage identities. As a managed Postgres service, there are some limitations in functionality. See Postgres compatibility.

Method

Description

SQL editor

Connect to and query your database from the Databricks SQL editor.

Databricks notebooks

Connect to and query from your database from Databricks notebooks.

External tools

Connect to and query from your database from external tools such as psql, DBeaver, and pgAdmin.

If you query through a Databricks SQL endpoint, the query is federated to a database instance.

Optimize queries

The pg_stat_statements extension provides detailed information on query performance to help identify slow queries and optimize them.

Users can see the query texts of their queries and any roles from which they inherit privileges. If you have ADMIN privilege on a role and you cannot see the query texts for that role, grant yourself membership in that role. If you have no privileges on a role, you cannot see its query texts for security reasons.

Sync a Unity Catalog table into Postgres

Syncing a Unity Catalog table into Postgres enables low-latency read queries and supports query-time joins with other Postgres tables.

Diagram of syncing a Unity Catalog table into Postgres

Key considerations:

  • Data freshness: Continuous sync mode refreshes data from the Unity Catalog managed table to the synced table at a minimum interval of 15 seconds.
  • Update rate: The sync pipeline supports continuous writes at approximately 1,200 rows per second per Capacity Unit (CU), and bulk writes at up to 15,000 rows per second per CU.
  • Table size limit: Each synced table can be up to 2 TB. If you require refreshes instead of full table recreation, the limit is 1 TB.

Use federated queries via DBSQL

Use federated queries via DBSQL to query your Postgres table without syncing your data.

Diagram of using federated queries to query a Postgres table

Key considerations:

  • Latency: Federated queries have higher latency than direct PostgreSQL queries.
  • Throughput: Databricks SQL supports lower queries per second (QPS) compared to querying PostgreSQL directly.
  • Query optimization: For best performance, write queries that filter on the primary key or an indexed column of the Postgres table.