
Lakebase Data API

Beta

Lakebase Postgres (Autoscaling Beta) is available in the following regions: us-east-1, us-west-2, eu-west-1.

This Beta release is the next version of Lakebase, available for evaluation only. For production workloads, use Lakebase Public Preview. See choosing between versions to understand which version is right for you.

The Lakebase Data API is a PostgREST-compatible RESTful interface that allows you to interact directly with your Lakebase Postgres database using standard HTTP methods. It offers API endpoints derived from your database schema, allowing for secure CRUD (Create, Read, Update, Delete) operations on your data without the need for custom backend development.

Overview

The Data API automatically generates RESTful endpoints based on your database schema. Each table in your database becomes accessible through HTTP requests, enabling you to:

  • Query data using HTTP GET requests with flexible filtering, sorting, and pagination
  • Insert records using HTTP POST requests
  • Update records using HTTP PATCH or PUT requests
  • Delete records using HTTP DELETE requests
  • Execute stored procedures using HTTP POST requests

This approach eliminates the need to write and maintain custom API code, allowing you to focus on your application logic and database schema.

note

The Lakebase Data API is Databricks' implementation designed to be compatible with the PostgREST specification. This means you can use PostgREST documentation, tools, and client libraries with the Lakebase Data API. Because the Data API is an independent implementation, some PostgREST features that aren't applicable to the Lakebase environment aren't included. For details on feature compatibility, see PostgREST compatibility.

Use cases

The Lakebase Data API is ideal for:

  • Web applications: Build frontends that directly interact with your database through HTTP requests
  • Microservices: Create lightweight services that access database resources via REST APIs
  • Serverless architectures: Integrate with serverless functions and edge computing platforms
  • Mobile applications: Provide mobile apps with direct database access through a RESTful interface
  • Third-party integrations: Enable external systems to read and write data securely

Set up the Data API

This section guides you through setting up the Data API, from creating required roles to making your first API request.

Prerequisites

Using the Data API requires a Lakebase Postgres (Autoscaling Beta) database project. If you don't have one, see Get started with database projects.

Create the authenticator role

All Data API database access goes through a single Postgres role named authenticator, which needs no permissions other than the ability to log in.

Create the authenticator Postgres role on the branch where your database resides. For example, if your Data API will access a database on the production branch, create the role on that branch.

Execute this SQL statement using the Lakebase SQL Editor or any SQL client:

PostgreSQL
CREATE ROLE authenticator LOGIN NOINHERIT NOCREATEDB NOCREATEROLE NOSUPERUSER;

Enable the Data API

Enable the Data API by executing these SQL statements in your database using the Lakebase SQL Editor or any SQL client.

These statements set up the infrastructure the Data API needs to function by:

  • Creating a pgrst schema to store API configuration
  • Configuring which database schemas are accessible via the API (public by default)
  • Enabling aggregate functions (like SUM, COUNT) in API responses
  • Granting the authenticator role pgrst schema and function permissions
PostgreSQL
CREATE SCHEMA IF NOT EXISTS pgrst;

CREATE OR REPLACE FUNCTION pgrst.pre_config()
RETURNS VOID AS $$
  SELECT
    set_config('pgrst.db_schemas', 'public', true),
    set_config('pgrst.db_aggregates_enabled', 'true', true)
$$ LANGUAGE SQL;

GRANT USAGE ON SCHEMA pgrst TO authenticator;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA pgrst TO authenticator;
note

If you need to expose additional schemas (beyond public), modify the pgrst.db_schemas setting to include them: set_config('pgrst.db_schemas', 'public,other_schema', true).
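For example, a pre_config function that exposes both public and a hypothetical sales schema would look like this sketch:

PostgreSQL
-- Sketch: expose public plus a hypothetical sales schema via the API
CREATE OR REPLACE FUNCTION pgrst.pre_config()
RETURNS VOID AS $$
  SELECT
    set_config('pgrst.db_schemas', 'public,sales', true),
    set_config('pgrst.db_aggregates_enabled', 'true', true)
$$ LANGUAGE SQL;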

Sample schema (optional)

The examples in this documentation use the following schema. You can create your own tables or use this sample schema for testing. Run these SQL statements using the Lakebase SQL Editor or any SQL client:

PostgreSQL
-- Create clients table
CREATE TABLE clients (
  id SERIAL PRIMARY KEY,
  name TEXT NOT NULL,
  email TEXT UNIQUE NOT NULL,
  company TEXT,
  phone TEXT,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Create projects table with foreign key to clients
CREATE TABLE projects (
  id SERIAL PRIMARY KEY,
  name TEXT NOT NULL,
  description TEXT,
  client_id INTEGER NOT NULL REFERENCES clients(id) ON DELETE CASCADE,
  status TEXT DEFAULT 'active',
  start_date DATE,
  end_date DATE,
  budget DECIMAL(10,2),
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Create tasks table with foreign key to projects
CREATE TABLE tasks (
  id SERIAL PRIMARY KEY,
  title TEXT NOT NULL,
  description TEXT,
  project_id INTEGER NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
  status TEXT DEFAULT 'pending',
  priority TEXT DEFAULT 'medium',
  assigned_to TEXT,
  due_date DATE,
  estimated_hours DECIMAL(5,2),
  actual_hours DECIMAL(5,2),
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Insert sample data
INSERT INTO clients (name, email, company, phone) VALUES
('Acme Corp', 'contact@acme.com', 'Acme Corporation', '+1-555-0101'),
('TechStart Inc', 'hello@techstart.com', 'TechStart Inc', '+1-555-0102'),
('Global Solutions', 'info@globalsolutions.com', 'Global Solutions Ltd', '+1-555-0103');

INSERT INTO projects (name, description, client_id, status, start_date, end_date, budget) VALUES
('Website Redesign', 'Complete overhaul of company website with modern design', 1, 'active', '2024-01-15', '2024-06-30', 25000.00),
('Mobile App Development', 'iOS and Android app for customer management', 1, 'planning', '2024-07-01', '2024-12-31', 50000.00),
('Database Migration', 'Migrate legacy system to cloud database', 2, 'active', '2024-02-01', '2024-05-31', 15000.00),
('API Integration', 'Integrate third-party services with existing platform', 3, 'completed', '2023-11-01', '2024-01-31', 20000.00);

INSERT INTO tasks (title, description, project_id, status, priority, assigned_to, due_date, estimated_hours, actual_hours) VALUES
('Design Homepage', 'Create wireframes and mockups for homepage', 1, 'in_progress', 'high', 'Sarah Johnson', '2024-03-15', 16.00, 8.00),
('Setup Development Environment', 'Configure local development setup', 1, 'completed', 'medium', 'Mike Chen', '2024-02-01', 4.00, 3.50),
('Database Schema Design', 'Design new database structure', 3, 'completed', 'high', 'Alex Rodriguez', '2024-02-15', 20.00, 18.00),
('API Authentication', 'Implement OAuth2 authentication flow', 4, 'completed', 'high', 'Lisa Wang', '2024-01-15', 12.00, 10.50),
('User Testing', 'Conduct usability testing with target users', 1, 'pending', 'medium', 'Sarah Johnson', '2024-04-01', 8.00, NULL),
('Performance Optimization', 'Optimize database queries and caching', 3, 'in_progress', 'medium', 'Alex Rodriguez', '2024-04-30', 24.00, 12.00);

Configure user permissions

All Data API requests must be authenticated using Databricks OAuth bearer tokens, which are sent via the Authorization header. Access is restricted to authenticated Databricks identities, with Postgres governing the underlying permissions.

The authenticator role assumes the identity of the requesting user when processing API requests. For this to work, each Databricks identity that accesses the Data API must have a corresponding Postgres role in your database. If you need to add users to your Databricks account first, see Add users to your account.

Add Postgres roles

Use the databricks_auth extension to create Postgres roles that correspond to Databricks identities:

Create the extension:

PostgreSQL
CREATE EXTENSION IF NOT EXISTS databricks_auth;

Add a Postgres role:

PostgreSQL
SELECT databricks_create_role('user@databricks.com', 'USER');

For detailed instructions, see Create an OAuth role for a Databricks identity using SQL.

important

Don't use your database owner account (the Databricks identity that created the Lakebase project) to access the Data API. The authenticator role must be able to assume your role, and that permission can't be granted for accounts with elevated privileges.

If you attempt to grant the database owner role to authenticator, you receive this error:

ERROR:  permission denied to grant role "db_owner_user@databricks.com"
DETAIL: Only roles with the ADMIN option on role "db_owner_user@databricks.com" may grant this role.

Grant permissions to users

Now that you've created corresponding Postgres roles for your Databricks identities, you need to grant permissions to those Postgres roles. These permissions control which database objects (schemas, tables, sequences, functions) each user can interact with via API requests.

Grant permissions using standard SQL GRANT statements. This example uses the public schema; if you're exposing a different schema, replace public with your schema name:

PostgreSQL
-- Allow authenticator to assume the identity of the user
GRANT "user@databricks.com" TO authenticator;

-- Allow user@databricks.com to access everything in public schema
GRANT USAGE ON SCHEMA public TO "user@databricks.com";
GRANT SELECT, UPDATE, INSERT, DELETE ON ALL TABLES IN SCHEMA public TO "user@databricks.com";
GRANT USAGE ON ALL SEQUENCES IN SCHEMA public TO "user@databricks.com";
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA public TO "user@databricks.com";

This example grants full access to the public schema for the user@databricks.com identity. Replace this with the actual Databricks identity and adjust permissions based on your requirements.
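The GRANT ... ON ALL TABLES statements above apply only to objects that already exist. If you expect to create tables later, standard PostgreSQL default privileges can extend the same access to future objects. A minimal sketch, assuming new objects are created by the role running these statements:

PostgreSQL
-- Sketch: also grant access on tables and sequences created in the future
ALTER DEFAULT PRIVILEGES IN SCHEMA public
  GRANT SELECT, UPDATE, INSERT, DELETE ON TABLES TO "user@databricks.com";
ALTER DEFAULT PRIVILEGES IN SCHEMA public
  GRANT USAGE ON SEQUENCES TO "user@databricks.com";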

important

Implement row-level security: The permissions above grant table-level access, but most API use cases require row-level restrictions. For example, in multi-tenant applications, users should only see their own data or their organization's data. Use PostgreSQL row-level security (RLS) policies to enforce fine-grained access control at the database level. See Implement row-level security.

Authentication

To access the Data API, you must provide a Databricks OAuth token in the Authorization header of your HTTP request. The authenticated Databricks identity must have a corresponding Postgres role (created in the previous steps) that defines its database permissions.

Get an OAuth token

Connect to your workspace as the Databricks identity for whom you created a Postgres role in the previous steps and obtain an OAuth token. See Authentication for instructions.

Construct the REST endpoint

The REST endpoint URL is built from several components: your workspace ID, database instance hostname, database name, and schema. The endpoint has this format:

https://${DB_INSTANCE}/api/2.0/workspace/${WORKSPACE_ID}/rest/${DATABASE_NAME}/${SCHEMA}

To get your database instance hostname (DB_INSTANCE), see Connect with an OAuth role. From the Connect dialog, copy the connection string, which contains the hostname. The hostname looks similar to this:

ep-odd-poetry-xxxxxxxx.database.us-west-2.databricks.com

For more information about connection string format, see Understanding connection strings.

Here's an example of a complete REST endpoint:

https://ep-odd-poetry-xxxxxxxx.database.us-west-2.databricks.com/api/2.0/workspace/1983801286941885/rest/databricks_postgres/public
tip

Use this script to store your OAuth token and construct the REST endpoint with your specific values. Save it as export_data_api_vars.sh:

export_data_api_vars.sh
#!/bin/bash

export DBX_OAUTH_TOKEN="<paste the token here>"

WORKSPACE_ID="0000000000000000"
DB_INSTANCE="ep-odd-poetry-xxxxxxxx.database.us-west-2.databricks.com"
DATABASE_NAME="databricks_postgres"
SCHEMA="public"

export REST_ENDPOINT="https://${DB_INSTANCE}/api/2.0/workspace/${WORKSPACE_ID}/rest/${DATABASE_NAME}/${SCHEMA}"

Replace the placeholder values with your actual values. Then run it with source to export the environment variables:

Bash
source export_data_api_vars.sh

The source command (or . shorthand) executes the script in your current shell, making the exported variables available for subsequent commands.

Make a request

With your OAuth token and REST endpoint configured, you can make API requests using curl or any HTTP client. The following examples assume you've exported the DBX_OAUTH_TOKEN and REST_ENDPOINT environment variables using the script shown in the tip above.

Here's an example call with the expected output (using the sample clients/projects/tasks schema):

Bash
curl -H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
"$REST_ENDPOINT/clients?select=id,name,projects(id,name)&id=gte.2"

Example response:

JSON
[
{ "id": 2, "name": "TechStart Inc", "projects": [{ "id": 3, "name": "Database Migration" }] },
{ "id": 3, "name": "Global Solutions", "projects": [{ "id": 4, "name": "API Integration" }] }
]

For detailed capabilities of the API, see the PostgREST API reference. For Lakebase-specific compatibility information, see PostgREST compatibility.

API reference

The following sections provide reference information for using the Data API, including common operations, advanced features, security considerations, and compatibility details.

Basic operations

Query records

Retrieve records from a table using HTTP GET:

Bash
curl -H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
"$REST_ENDPOINT/clients"

Example response:

JSON
[
  { "id": 1, "name": "Acme Corp", "email": "contact@acme.com", "company": "Acme Corporation", "phone": "+1-555-0101" },
  { "id": 2, "name": "TechStart Inc", "email": "hello@techstart.com", "company": "TechStart Inc", "phone": "+1-555-0102" }
]

Filter results

Use query parameters to filter results. This example retrieves clients with id greater than or equal to 2:

Bash
curl -H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
"$REST_ENDPOINT/clients?id=gte.2"

Example response:

JSON
[
{ "id": 2, "name": "TechStart Inc", "email": "hello@techstart.com" },
{ "id": 3, "name": "Global Solutions", "email": "info@globalsolutions.com" }
]

Select specific columns and join tables

Use the select parameter to retrieve specific columns and join related tables:

Bash
curl -H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
"$REST_ENDPOINT/clients?select=id,name,projects(id,name)&id=gte.2"

Example response:

JSON
[
{ "id": 2, "name": "TechStart Inc", "projects": [{ "id": 3, "name": "Database Migration" }] },
{ "id": 3, "name": "Global Solutions", "projects": [{ "id": 4, "name": "API Integration" }] }
]

Insert records

Create new records using HTTP POST:

Bash
curl -X POST \
  -H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "New Client",
    "email": "newclient@example.com",
    "company": "New Company Inc",
    "phone": "+1-555-0104"
  }' \
  "$REST_ENDPOINT/clients"

Update records

Update existing records using HTTP PATCH:

Bash
curl -X PATCH \
-H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{"phone": "+1-555-0199"}' \
"$REST_ENDPOINT/clients?id=eq.1"

Delete records

Delete records using HTTP DELETE:

Bash
curl -X DELETE \
-H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
"$REST_ENDPOINT/clients?id=eq.5"

Advanced features

Pagination

Control the number of records returned using the limit and offset parameters:

Bash
curl -H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
"$REST_ENDPOINT/tasks?limit=10&offset=0"

Sorting

Sort results using the order parameter:

Bash
curl -H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
"$REST_ENDPOINT/tasks?order=due_date.desc"

Complex filtering

Combine multiple filter conditions:

Bash
curl -H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
"$REST_ENDPOINT/tasks?status=eq.in_progress&priority=eq.high"

Common filter operators:

  • eq - equals
  • gte - greater than or equal
  • lte - less than or equal
  • neq - not equal
  • like - pattern matching
  • in - matches any value in list
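For example, combining the like and in operators (standard PostgREST syntax, where * is the pattern wildcard):

Bash
curl -H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
"$REST_ENDPOINT/clients?company=like.*Corp*&id=in.(1,2,3)"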

For more information about supported query parameters and API features, see the PostgREST API reference. For Lakebase-specific compatibility information, see PostgREST compatibility.

Row-level security

Row-level security (RLS) policies provide fine-grained access control by restricting which rows users can access in a table.

How RLS works with the Data API: When a user makes an API request, the authenticator role assumes that user's identity. Any RLS policies defined for that user's role are automatically enforced by PostgreSQL, filtering the data they can access. This happens at the database level, so even if application code tries to query all rows, the database only returns rows the user is permitted to see. This provides defense-in-depth security without requiring filtering logic in your application code.

Why RLS is critical for APIs: Unlike direct database connections where you control the connection context, HTTP APIs expose your database to multiple users through a single endpoint. Table-level permissions alone mean that if a user can access the clients table, they can access all client records unless you implement filtering. RLS policies ensure each user automatically sees only their authorized data.

RLS is essential for:

  • Multi-tenant applications: Isolate data between different customers or organizations
  • User-owned data: Ensure users only access their own records
  • Team-based access: Limit visibility to team members or specific groups
  • Compliance requirements: Enforce data access restrictions at the database level

Enable RLS on a table and create policies that define access rules:

PostgreSQL
ALTER TABLE "clients" ENABLE ROW LEVEL SECURITY;

CREATE POLICY "restrict client access" ON "clients"
  AS PERMISSIVE FOR SELECT TO "user@databricks.com"
  USING (id IN (1, 2));

When RLS is enabled on a given table, roles that are subject to RLS only see rows that match at least one policy; all other rows are filtered out. The example above restricts user@databricks.com to only see rows in the clients table where the id is 1 or 2, even though they have SELECT permission on the entire table.
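The policy above governs reads only. In standard PostgreSQL, writes need their own policies with a WITH CHECK clause; a minimal sketch that limits inserts to the same rows:

PostgreSQL
-- Sketch: restrict inserts to the same rows the user can read
CREATE POLICY "restrict client writes" ON "clients"
  AS PERMISSIVE FOR INSERT TO "user@databricks.com"
  WITH CHECK (id IN (1, 2));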

Common RLS patterns for the Data API:

  • User ownership: USING (user_email = current_user) - Restricts rows to the authenticated user
  • Tenant isolation: USING (tenant_id = (SELECT tenant_id FROM user_tenants WHERE user_email = current_user)) - Restricts to user's tenant
  • Team membership: USING (team_id IN (SELECT team_id FROM user_teams WHERE user_email = current_user)) - Restricts to user's teams
  • Role-based access: USING (pg_has_role(current_user, required_role, 'member')) - Restricts based on role membership
note

In Lakebase, current_user returns the authenticated user's email address (for example, user@databricks.com). Use this in your RLS policies to identify which user is making the request.
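Putting this together, a user-ownership policy might look like the following sketch, which assumes a hypothetical documents table with an owner_email column:

PostgreSQL
-- Sketch: owner_email is a hypothetical column storing the owner's email
CREATE POLICY "owner only" ON documents
  FOR SELECT TO PUBLIC
  USING (owner_email = current_user);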

For comprehensive information about implementing RLS, including policy types, security best practices, and advanced patterns, see the PostgreSQL Row Security Policies documentation.

For more information about permissions, see Manage permissions.

PostgREST compatibility

The Lakebase Data API is compatible with the PostgREST specification. You can:

  • Use existing PostgREST client libraries and tools
  • Follow PostgREST conventions for filtering, ordering, and pagination
  • Adapt documentation and examples from the PostgREST community
note

The Lakebase Data API is an independent implementation and not based on the open-source PostgREST project. Consequently, it doesn't include certain PostgREST features or configurations that aren't applicable to the Lakebase environment.

For comprehensive details on API features, query parameters, and capabilities, see the PostgREST API reference.

Feature compatibility reference

This section lists PostgREST features that have different behavior or are not supported in the Lakebase Data API.

Authentication and authorization

  • JWT configuration (Not applicable): The Lakebase Data API uses Databricks OAuth tokens instead of JWT authentication. JWT-specific configuration options (custom secrets, RS256 keys, audience validation) are not available.

Resource embedding

  • Computed relationships (Not supported): Custom relationships defined through database functions that return SETOF or single records are not supported. Only foreign key relationships can be embedded.
  • Inner join embedding (Not supported): The !inner hint that converts left joins to inner joins for filtering parent rows based on child criteria is not supported. Example: ?select=*,clients!inner(*)&clients.id=eq.1

Response formats

  • Custom media type handlers (Not supported): Custom output formats through PostgreSQL aggregates (binary formats, XML, protocol buffers) are not supported.
  • Stripped nulls (Not supported): The nulls=stripped media type parameter that removes null fields from JSON responses is not supported. Example: Accept: application/vnd.pgrst.object+json;nulls=stripped
  • PostGIS GeoJSON (Partially supported): PostGIS geometry columns can be queried, but automatic GeoJSON formatting via the Accept: application/geo+json header is not available.

Pagination and counting

  • Planned count (Not supported): The Prefer: count=planned option that uses PostgreSQL's query planner to estimate result counts is not supported. Use Prefer: count=exact instead.
  • Estimated count (Not supported): The Prefer: count=estimated option that uses PostgreSQL statistics for approximate counts is not supported. Use Prefer: count=exact instead.
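For example, requesting an exact total using the standard PostgREST header; the total appears in the Content-Range response header (for example, Content-Range: 0-4/6). Verify the behavior in your environment:

Bash
curl -i \
-H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
-H "Prefer: count=exact" \
"$REST_ENDPOINT/tasks?limit=5"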

Request preferences

  • Timezone preference (Partially supported): Timezone handling exists, but the Prefer: timezone=America/Los_Angeles header for overriding the server timezone is not supported. Configure the server timezone or use database-level timezone functions.
  • Transaction control (Not supported): Transaction control via the Prefer: tx=commit and Prefer: tx=rollback headers is not supported.
  • Preference handling modes (Not supported): The Prefer: handling=strict and Prefer: handling=lenient options for controlling how invalid preferences are handled are not supported.

Observability

The Lakebase Data API implements its own observability features. The following PostgREST observability features are not supported:

  • Query plan exposure (Not supported): The Accept: application/vnd.pgrst.plan+json header that exposes PostgreSQL EXPLAIN output for performance analysis is not supported.
  • Server-Timing header (Not supported): The Server-Timing header that provides a request timing breakdown is not supported.
  • Trace header propagation (Not supported): X-Request-Id and custom trace header propagation for distributed tracing is not supported.

Advanced configuration

  • Application settings (GUCs) (Not supported): Passing custom configuration values to database functions via PostgreSQL GUCs is not supported.
  • Pre-request function (Not supported): The db-pre-request configuration that allows specifying a database function to run before each request is not supported.

For more information about PostgREST features, see the PostgREST documentation.

Security considerations

The Data API enforces your database's security model at multiple levels:

  • Authentication: All requests require valid OAuth token authentication
  • Role-based access: Database-level permissions control which tables and operations users can access
  • Row-level security: RLS policies enforce fine-grained access control, restricting which specific rows users can see or modify
  • User context: The API assumes the authenticated user's identity, ensuring database permissions and policies apply correctly

For production deployments:

  1. Implement row-level security: Use RLS policies to restrict data access at the row level. This is especially important for multi-tenant applications and user-owned data. See Row-level security.
  2. Grant minimal permissions: Only grant the permissions users need (SELECT, INSERT, UPDATE, DELETE) on specific tables rather than granting broad access; see the sketch after this list.
  3. Use separate roles per application: Create dedicated roles for different applications or services rather than sharing a single role.
  4. Audit access regularly: Review granted permissions and RLS policies periodically to ensure they match your security requirements.
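For point 2, a minimal-permission sketch: a read-only reporting identity (the identity and table names are illustrative) gets exactly the access it needs and nothing more:

PostgreSQL
-- Sketch: read-only access to two specific tables for a hypothetical identity
GRANT "report_user@databricks.com" TO authenticator;
GRANT USAGE ON SCHEMA public TO "report_user@databricks.com";
GRANT SELECT ON public.clients, public.projects TO "report_user@databricks.com";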

For information about managing roles and permissions, see: