Lakebase Data API
Lakebase Postgres (Autoscaling Beta) is available in the following regions: us-east-1, us-west-2, eu-west-1.
This Beta release is the next version of Lakebase, available for evaluation only. For production workloads, use Lakebase Public Preview. See choosing between versions to understand which version is right for you.
The Lakebase Data API is a PostgREST-compatible RESTful interface for interacting directly with your Lakebase Postgres database using standard HTTP methods. It exposes API endpoints derived from your database schema, enabling secure CRUD (Create, Read, Update, Delete) operations on your data without custom backend development.
Overview
The Data API automatically generates RESTful endpoints based on your database schema. Each table in your database becomes accessible through HTTP requests, enabling you to:
- Query data using HTTP GET requests with flexible filtering, sorting, and pagination
- Insert records using HTTP POST requests
- Update records using HTTP PATCH or PUT requests
- Delete records using HTTP DELETE requests
- Execute stored procedures using HTTP POST requests
This approach eliminates the need to write and maintain custom API code, allowing you to focus on your application logic and database schema.
The Lakebase Data API is Databricks' implementation designed to be compatible with the PostgREST specification. This means you can use PostgREST documentation, tools, and client libraries with the Lakebase Data API. Because the Data API is an independent implementation, some PostgREST features that aren't applicable to the Lakebase environment aren't included. For details on feature compatibility, see PostgREST compatibility.
Use cases
The Lakebase Data API is ideal for:
- Web applications: Build frontends that directly interact with your database through HTTP requests
- Microservices: Create lightweight services that access database resources via REST APIs
- Serverless architectures: Integrate with serverless functions and edge computing platforms
- Mobile applications: Provide mobile apps with direct database access through a RESTful interface
- Third-party integrations: Enable external systems to read and write data securely
Set up the Data API
This section guides you through setting up the Data API, from creating required roles to making your first API request.
Prerequisites
Using the Data API requires a Lakebase Postgres (Autoscaling Beta) database project. If you don't have one, see Get started with database projects.
Create the authenticator role
All Data API access to the database goes through a single Postgres role named authenticator, which requires no privileges other than the ability to log in.
Create the authenticator Postgres role on the branch where your database resides. For example, if your Data API will access a database on the production branch, create the role on that branch.
SQL

Execute this SQL statement using the Lakebase SQL Editor or any SQL client:

CREATE ROLE authenticator LOGIN NOINHERIT NOCREATEDB NOCREATEROLE NOSUPERUSER;

Console

Create the role through the Lakebase App:

- Navigate to your branch's Roles & Databases tab.
- Click Add role and specify authenticator as the role name.
- Click Create.
- Copy the generated password and save it securely (you may need it for troubleshooting, though the authenticator role typically doesn't require direct password login).
Enable the Data API
Enable the Data API by executing these SQL statements in your database using the Lakebase SQL Editor or any SQL client.
These statements set up the infrastructure the Data API needs to function by:
- Creating a pgrst schema to store API configuration
- Configuring which database schemas are accessible via the API (public by default)
- Enabling aggregate functions (like SUM and COUNT) in API responses
- Granting the authenticator role usage and execute permissions on the pgrst schema and its functions
CREATE SCHEMA IF NOT EXISTS pgrst;
CREATE OR REPLACE FUNCTION pgrst.pre_config()
RETURNS VOID AS $$
SELECT
set_config('pgrst.db_schemas', 'public', true),
set_config('pgrst.db_aggregates_enabled', 'true', true)
$$ LANGUAGE SQL;
GRANT USAGE ON SCHEMA pgrst TO authenticator;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA pgrst TO authenticator;
If you need to expose additional schemas (beyond public), modify the pgrst.db_schemas setting to include them: set_config('pgrst.db_schemas', 'public,other_schema', true).
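For example, here is the same configuration function rewritten to expose a hypothetical analytics schema alongside public (the schema name is illustrative; remember to also grant your API users access to any additional schema, as described in Grant permissions to users):

CREATE OR REPLACE FUNCTION pgrst.pre_config()
RETURNS VOID AS $$
SELECT
set_config('pgrst.db_schemas', 'public,analytics', true),
set_config('pgrst.db_aggregates_enabled', 'true', true)
$$ LANGUAGE SQL;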
Sample schema (optional)
The examples in this documentation use the following schema. You can create your own tables or use this sample schema for testing. Run these SQL statements using the Lakebase SQL Editor or any SQL client:
-- Create clients table
CREATE TABLE clients (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
email TEXT UNIQUE NOT NULL,
company TEXT,
phone TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Create projects table with foreign key to clients
CREATE TABLE projects (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
description TEXT,
client_id INTEGER NOT NULL REFERENCES clients(id) ON DELETE CASCADE,
status TEXT DEFAULT 'active',
start_date DATE,
end_date DATE,
budget DECIMAL(10,2),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Create tasks table with foreign key to projects
CREATE TABLE tasks (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
description TEXT,
project_id INTEGER NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
status TEXT DEFAULT 'pending',
priority TEXT DEFAULT 'medium',
assigned_to TEXT,
due_date DATE,
estimated_hours DECIMAL(5,2),
actual_hours DECIMAL(5,2),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Insert sample data
INSERT INTO clients (name, email, company, phone) VALUES
('Acme Corp', 'contact@acme.com', 'Acme Corporation', '+1-555-0101'),
('TechStart Inc', 'hello@techstart.com', 'TechStart Inc', '+1-555-0102'),
('Global Solutions', 'info@globalsolutions.com', 'Global Solutions Ltd', '+1-555-0103');
INSERT INTO projects (name, description, client_id, status, start_date, end_date, budget) VALUES
('Website Redesign', 'Complete overhaul of company website with modern design', 1, 'active', '2024-01-15', '2024-06-30', 25000.00),
('Mobile App Development', 'iOS and Android app for customer management', 1, 'planning', '2024-07-01', '2024-12-31', 50000.00),
('Database Migration', 'Migrate legacy system to cloud database', 2, 'active', '2024-02-01', '2024-05-31', 15000.00),
('API Integration', 'Integrate third-party services with existing platform', 3, 'completed', '2023-11-01', '2024-01-31', 20000.00);
INSERT INTO tasks (title, description, project_id, status, priority, assigned_to, due_date, estimated_hours, actual_hours) VALUES
('Design Homepage', 'Create wireframes and mockups for homepage', 1, 'in_progress', 'high', 'Sarah Johnson', '2024-03-15', 16.00, 8.00),
('Setup Development Environment', 'Configure local development setup', 1, 'completed', 'medium', 'Mike Chen', '2024-02-01', 4.00, 3.50),
('Database Schema Design', 'Design new database structure', 3, 'completed', 'high', 'Alex Rodriguez', '2024-02-15', 20.00, 18.00),
('API Authentication', 'Implement OAuth2 authentication flow', 4, 'completed', 'high', 'Lisa Wang', '2024-01-15', 12.00, 10.50),
('User Testing', 'Conduct usability testing with target users', 1, 'pending', 'medium', 'Sarah Johnson', '2024-04-01', 8.00, NULL),
('Performance Optimization', 'Optimize database queries and caching', 3, 'in_progress', 'medium', 'Alex Rodriguez', '2024-04-30', 24.00, 12.00);
Configure user permissions
All Data API requests must be authenticated using Databricks OAuth bearer tokens, which are sent via the Authorization header. Access is restricted to authenticated Databricks identities, with Postgres governing the underlying permissions.
The authenticator role assumes the identity of the requesting user when processing API requests. For this to work, each Databricks identity that accesses the Data API must have a corresponding Postgres role in your database. If you need to add users to your Databricks account first, see Add users to your account.
Add Postgres roles
Use the databricks_auth extension to create Postgres roles that correspond to Databricks identities:
Create the extension:
CREATE EXTENSION IF NOT EXISTS databricks_auth;
Add a Postgres role:
SELECT databricks_create_role('user@databricks.com', 'USER');
For detailed instructions, see Create an OAuth role for a Databricks identity using SQL.
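The extension can also create roles for other identity types. For example, a hedged sketch for a service principal, assuming the extension accepts a SERVICE_PRINCIPAL identity type and using a placeholder client ID:

SELECT databricks_create_role('00000000-0000-0000-0000-000000000000', 'SERVICE_PRINCIPAL');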
Don't use your database owner account (the Databricks identity who created the Lakebase project) to access the Data API. The authenticator role requires the ability to assume your role, and that permission can't be granted for accounts with elevated privileges.
If you attempt to grant the database owner role to authenticator, you receive this error:
ERROR: permission denied to grant role "db_owner_user@databricks.com"
DETAIL: Only roles with the ADMIN option on role "db_owner_user@databricks.com" may grant this role.
Grant permissions to users
Now that you've created corresponding Postgres roles for your Databricks identities, you need to grant permissions to those Postgres roles. These permissions control which database objects (schemas, tables, sequences, functions) each user can interact with via API requests.
Grant permissions using standard SQL GRANT statements. This example uses the public schema; if you're exposing a different schema, replace public with your schema name:
-- Allow authenticator to assume the identity of the user
GRANT "user@databricks.com" TO authenticator;
-- Allow user@databricks.com to access everything in public schema
GRANT USAGE ON SCHEMA public TO "user@databricks.com";
GRANT SELECT, UPDATE, INSERT, DELETE ON ALL TABLES IN SCHEMA public TO "user@databricks.com";
GRANT USAGE ON ALL SEQUENCES IN SCHEMA public TO "user@databricks.com";
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA public TO "user@databricks.com";
This example grants full access to the public schema for the user@databricks.com identity. Replace this with the actual Databricks identity and adjust permissions based on your requirements.
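If an identity only needs to read data through the API, narrow the grants accordingly. A minimal read-only sketch for the same example identity:

-- Allow authenticator to assume the identity of the user
GRANT "user@databricks.com" TO authenticator;
-- Read-only access to the public schema
GRANT USAGE ON SCHEMA public TO "user@databricks.com";
GRANT SELECT ON ALL TABLES IN SCHEMA public TO "user@databricks.com";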
Implement row-level security: The permissions above grant table-level access, but most API use cases require row-level restrictions. For example, in multi-tenant applications, users should only see their own data or their organization's data. Use PostgreSQL row-level security (RLS) policies to enforce fine-grained access control at the database level. See Implement row-level security.
Authentication
To access the Data API, you must provide a Databricks OAuth token in the Authorization header of your HTTP request. The authenticated Databricks identity must have a corresponding Postgres role (created in the previous steps) that defines its database permissions.
Get an OAuth token
Connect to your workspace as the Databricks identity for whom you created a Postgres role in the previous steps and obtain an OAuth token. See Authentication for instructions.
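For example, one way to obtain a token is with the Databricks CLI, assuming it is installed and configured for your workspace; the command prints a JSON object whose access_token field contains the OAuth token:

databricks auth token --host https://<your-workspace-url>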
Construct the REST endpoint
The REST endpoint URL is built from several components: your workspace ID, database instance hostname, database name, and schema. The endpoint has this format:
https://${DB_INSTANCE}/api/2.0/workspace/${WORKSPACE_ID}/rest/${DATABASE_NAME}/${SCHEMA}
To get your database instance hostname (DB_INSTANCE), see Connect with an OAuth role. From the Connect dialog, copy the connection string, which contains the hostname. The hostname looks similar to this:
ep-odd-poetry-xxxxxxxx.database.us-west-2.databricks.com
For more information about connection string format, see Understanding connection strings.
Here's an example of a complete REST endpoint:
https://ep-odd-poetry-xxxxxxxx.database.us-west-2.databricks.com/api/2.0/workspace/1983801286941885/rest/databricks_postgres/public
Use this script to store your OAuth token and construct the REST endpoint with your specific values. Save it as export_data_api_vars.sh:
#!/bin/bash
export DBX_OAUTH_TOKEN="<paste the token here>"
WORKSPACE_ID="0000000000000000"
DB_INSTANCE="ep-odd-poetry-xxxxxxxx.database.us-west-2.databricks.com"
DATABASE_NAME="databricks_postgres"
SCHEMA="public"
export REST_ENDPOINT="https://${DB_INSTANCE}/api/2.0/workspace/${WORKSPACE_ID}/rest/${DATABASE_NAME}/${SCHEMA}"
Replace the placeholder values with your actual values. Then run it with source to export the environment variables:
source export_data_api_vars.sh
The source command (or . shorthand) executes the script in your current shell, making the exported variables available for subsequent commands.
Making a request
With your OAuth token and REST endpoint configured, you can make API requests using curl or any HTTP client. The following examples assume you've exported the DBX_OAUTH_TOKEN and REST_ENDPOINT environment variables using the script shown in the tip above.
Here's an example call with the expected output (using the sample clients/projects/tasks schema):
curl -H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
"$REST_ENDPOINT/clients?select=id,name,projects(id,name)&id=gte.2"
Example response:
[
{ "id": 2, "name": "TechStart Inc", "projects": [{ "id": 3, "name": "Database Migration" }] },
{ "id": 3, "name": "Global Solutions", "projects": [{ "id": 4, "name": "API Integration" }] }
]
For detailed capabilities of the API, see the PostgREST API reference. For Lakebase-specific compatibility information, see PostgREST compatibility.
API reference
The following sections provide reference information for using the Data API, including common operations, advanced features, security considerations, and compatibility details.
Basic operations
Query records
Retrieve records from a table using HTTP GET:
curl -H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
"$REST_ENDPOINT/clients"
Example response:
[
{ "id": 1, "name": "Acme Corp", "email": "contact@acme.com", "company": "Acme Corporation", "phone": "+1-555-0101" },
{
"id": 2,
"name": "TechStart Inc",
"email": "hello@techstart.com",
"company": "TechStart Inc",
"phone": "+1-555-0102"
}
]
Filter results
Use query parameters to filter results. This example retrieves clients with id greater than or equal to 2:
curl -H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
"$REST_ENDPOINT/clients?id=gte.2"
Example response:
[
{ "id": 2, "name": "TechStart Inc", "email": "hello@techstart.com" },
{ "id": 3, "name": "Global Solutions", "email": "info@globalsolutions.com" }
]
Select specific columns and join tables
Use the select parameter to retrieve specific columns and join related tables:
curl -H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
"$REST_ENDPOINT/clients?select=id,name,projects(id,name)&id=gte.2"
Example response:
[
{ "id": 2, "name": "TechStart Inc", "projects": [{ "id": 3, "name": "Database Migration" }] },
{ "id": 3, "name": "Global Solutions", "projects": [{ "id": 4, "name": "API Integration" }] }
]
Insert records
Create new records using HTTP POST:
curl -X POST \
-H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "New Client",
"email": "newclient@example.com",
"company": "New Company Inc",
"phone": "+1-555-0104"
}' \
"$REST_ENDPOINT/clients"
Update records
Update existing records using HTTP PATCH:
curl -X PATCH \
-H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{"phone": "+1-555-0199"}' \
"$REST_ENDPOINT/clients?id=eq.1"
Delete records
Delete records using HTTP DELETE. Take care to include a filter; in the PostgREST convention, an unfiltered DELETE applies to every row the user can access:
curl -X DELETE \
-H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
"$REST_ENDPOINT/clients?id=eq.5"
Advanced features
Pagination
Control the number of records returned using the limit and offset parameters:
curl -H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
"$REST_ENDPOINT/tasks?limit=10&offset=0"
Sorting
Sort results using the order parameter:
curl -H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
"$REST_ENDPOINT/tasks?order=due_date.desc"
Complex filtering
Combine multiple filter conditions:
curl -H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
"$REST_ENDPOINT/tasks?status=eq.in_progress&priority=eq.high"
Common filter operators:
- eq - equals
- gte - greater than or equal
- lte - less than or equal
- neq - not equal
- like - pattern matching
- in - matches any value in a list
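For example, combine the in and like operators (in PostgREST URL syntax, * is the wildcard character for like):

curl -H "Authorization: Bearer $DBX_OAUTH_TOKEN" \
"$REST_ENDPOINT/tasks?status=in.(pending,in_progress)&title=like.*Design*"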
For more information about supported query parameters and API features, see the PostgREST API reference. For Lakebase-specific compatibility information, see PostgREST compatibility.
Row-level security
Row-level security (RLS) policies provide fine-grained access control by restricting which rows users can access in a table.
How RLS works with the Data API: When a user makes an API request, the authenticator role assumes that user's identity. Any RLS policies defined for that user's role are automatically enforced by PostgreSQL, filtering the data they can access. This happens at the database level, so even if application code tries to query all rows, the database only returns rows the user is permitted to see. This provides defense-in-depth security without requiring filtering logic in your application code.
Why RLS is critical for APIs: Unlike direct database connections where you control the connection context, HTTP APIs expose your database to multiple users through a single endpoint. Table-level permissions alone mean that if a user can access the clients table, they can access all client records unless you implement filtering. RLS policies ensure each user automatically sees only their authorized data.
RLS is essential for:
- Multi-tenant applications: Isolate data between different customers or organizations
- User-owned data: Ensure users only access their own records
- Team-based access: Limit visibility to team members or specific groups
- Compliance requirements: Enforce data access restrictions at the database level
Enable RLS on a table and create policies that define access rules:
ALTER TABLE "clients" ENABLE ROW LEVEL SECURITY;
CREATE POLICY "restrict client access" ON "clients"
AS PERMISSIVE FOR SELECT TO "user@databricks.com"
USING (id IN (1, 2));
When RLS is enabled on a given table, roles that are subject to RLS only see rows that match at least one policy; all other rows are filtered out. The example above restricts user@databricks.com to only see rows in the clients table where the id is 1 or 2, even though they have SELECT permission on the entire table.
Common RLS patterns for the Data API:
- User ownership: USING (user_email = current_user) - Restricts rows to the authenticated user
- Tenant isolation: USING (tenant_id = (SELECT tenant_id FROM user_tenants WHERE user_email = current_user)) - Restricts to the user's tenant
- Team membership: USING (team_id IN (SELECT team_id FROM user_teams WHERE user_email = current_user)) - Restricts to the user's teams
- Role-based access: USING (pg_has_role(current_user, required_role, 'member')) - Restricts based on role membership
In Lakebase, current_user returns the authenticated user's email address (for example, user@databricks.com). Use this in your RLS policies to identify which user is making the request.
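Putting these pieces together, here is a minimal sketch of the user-ownership pattern. It assumes a hypothetical owner_email column on the tasks table (not part of the sample schema above), populated with each owner's Databricks email address:

-- Hypothetical column holding the owning user's Databricks email
ALTER TABLE tasks ADD COLUMN owner_email TEXT;
ALTER TABLE tasks ENABLE ROW LEVEL SECURITY;
-- Every role subject to RLS sees only the tasks it owns
CREATE POLICY "tasks owned by requester" ON tasks
AS PERMISSIVE FOR SELECT TO PUBLIC
USING (owner_email = current_user);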
For comprehensive information about implementing RLS, including policy types, security best practices, and advanced patterns, see the PostgreSQL Row Security Policies documentation.
For more information about permissions, see Manage permissions.
PostgREST compatibility
The Lakebase Data API is compatible with the PostgREST specification. You can:
- Use existing PostgREST client libraries and tools
- Follow PostgREST conventions for filtering, ordering, and pagination
- Adapt documentation and examples from the PostgREST community
The Lakebase Data API is an independent implementation and not based on the open-source PostgREST project. Consequently, it doesn't include certain PostgREST features or configurations that aren't applicable to the Lakebase environment.
For comprehensive details on API features, query parameters, and capabilities, see the PostgREST API reference.
Feature compatibility reference
This section lists PostgREST features that have different behavior or are not supported in the Lakebase Data API.
Authentication and authorization
| Feature | Status | Details |
|---|---|---|
| JWT configuration | Not applicable | The Lakebase Data API uses Databricks OAuth tokens instead of JWT authentication. JWT-specific configuration options (custom secrets, RS256 keys, audience validation) are not available. |
Resource embedding
| Feature | Status | Details |
|---|---|---|
| Computed relationships | Not supported | Custom relationships defined through database functions that return SETOF rows are not supported. |
| Inner join embedding (!inner) | Not supported | The !inner modifier, which filters parent rows to those with matching embedded rows, is not supported. |
Response formats
| Feature | Status | Details |
|---|---|---|
| Custom media type handlers | Not supported | Custom output formats through PostgreSQL aggregates (binary formats, XML, protocol buffers) are not supported. |
| Stripped nulls | Not supported | The Prefer: nulls=stripped preference, which removes null-valued fields from JSON responses, is not supported. |
| PostGIS GeoJSON | Partially supported | PostGIS geometry columns can be queried, but automatic GeoJSON formatting via the Accept: application/geo+json media type is not supported. |
Pagination and counting
| Feature | Status | Details |
|---|---|---|
| Planned count | Not supported | The Prefer: count=planned preference, which estimates row counts from the query plan, is not supported. |
| Estimated count | Not supported | The Prefer: count=estimated preference, which uses exact counts up to a threshold and planned counts beyond it, is not supported. |
Request preferences
| Feature | Status | Details |
|---|---|---|
| Timezone preference | Partially supported | Timezone handling exists, but the Prefer: timezone= preference for per-request timezone selection is not supported. |
| Transaction control | Not supported | Transaction control via Prefer: tx=commit and Prefer: tx=rollback is not supported. |
| Preference handling modes | Not supported | The Prefer: handling=strict and Prefer: handling=lenient modes, which control how unrecognized preferences are treated, are not supported. |
Observability
The Lakebase Data API implements its own observability features. The following PostgREST observability features are not supported:
| Feature | Status | Details |
|---|---|---|
| Query plan exposure | Not supported | The Accept: application/vnd.pgrst.plan media type for retrieving query execution plans is not supported. |
| Server-Timing header | Not supported | The Server-Timing header that provides a request timing breakdown is not supported. Lakebase implements its own observability features. |
| Trace header propagation | Not supported | X-Request-Id and custom trace header propagation for distributed tracing are not supported. Lakebase implements its own observability features. |
Advanced configuration
| Feature | Status | Details |
|---|---|---|
| Application settings (GUCs) | Not supported | Passing custom configuration values to database functions via PostgreSQL GUCs is not supported. |
| Pre-request function | Not supported | The db-pre-request setting, which runs a designated function before each request, is not supported. |
For more information about PostgREST features, see the PostgREST documentation.
Security considerations
The Data API enforces your database's security model at multiple levels:
- Authentication: All requests require valid OAuth token authentication
- Role-based access: Database-level permissions control which tables and operations users can access
- Row-level security: RLS policies enforce fine-grained access control, restricting which specific rows users can see or modify
- User context: The API assumes the authenticated user's identity, ensuring database permissions and policies apply correctly
Recommended security practices
For production deployments:
- Implement row-level security: Use RLS policies to restrict data access at the row level. This is especially important for multi-tenant applications and user-owned data. See Row-level security.
- Grant minimal permissions: Only grant the permissions users need (SELECT, INSERT, UPDATE, DELETE) on specific tables rather than granting broad access.
- Use separate roles per application: Create dedicated roles for different applications or services rather than sharing a single role.
- Audit access regularly: Review granted permissions and RLS policies periodically to ensure they match your security requirements.
For more information about managing roles and permissions, see Manage permissions.