
Typical Lakebase project setup with Terraform

Beta

As of June 15, Lakebase is available in Beta on GCP. See Region availability for supported regions.

This page shows a complete Terraform configuration for a production-ready Lakebase Autoscaling project with the most commonly used features:

  • Protected production branch
  • High availability (HA) read-write endpoint with readable secondaries
  • Service principal with DATABRICKS_SUPERUSER database privileges
  • App-owned Postgres database
  • Postgres database registered in Unity Catalog for Lakehouse Federation queries from Databricks SQL and notebooks
  • Synced table streaming continuously from Unity Catalog
  • Databricks App connected to the Lakebase project

For a step-by-step introduction to Terraform with Lakebase, see Get started with Terraform for Lakebase.

Prerequisites

Before you begin, you must have the following:

Complete configuration

When you create a project, Databricks automatically creates a production branch and a primary read-write endpoint. To configure these implicitly created resources, declare them in Terraform with replace_existing = true. For more details, see databricks_postgres_branch and databricks_postgres_endpoint.
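
The configuration on this page does not include a provider declaration, so it presumably relies on one defined elsewhere in the module. As a minimal sketch, assuming credentials come from environment variables such as DATABRICKS_HOST and DATABRICKS_TOKEN or from a Databricks configuration profile, the declaration might look like the following. No version constraint is shown because this page does not state the minimum provider version that supports the databricks_postgres_* resources.

```hcl
# Sketch only: declares the Databricks provider used by the resources on this page.
# Add a version constraint once you know which provider release you are targeting.
terraform {
  required_providers {
    databricks = {
      source = "databricks/databricks"
    }
  }
}

# An empty provider block lets the provider resolve authentication from the
# environment or a configuration profile (an assumption about your setup).
provider "databricks" {}
```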

Warning

This configuration sets is_protected = true on the production branch and includes an unprotect_for_destroy variable wired into the branch specification. Terraform can't delete a project that contains protected branches, and the production branch can't be deleted directly because its lifecycle is controlled by the project. To tear down resources cleanly, use a two-step destroy:

Bash
# Step 1: unprotect the branch
terraform apply -var="unprotect_for_destroy=true"

# Step 2: destroy all resources
terraform destroy -var="unprotect_for_destroy=true"

After running terraform destroy, the project is soft-deleted and retained for 7 days before permanent deletion. To permanently delete it immediately, set purge_on_delete = true on the databricks_postgres_project resource before running destroy.

Hcl
variable "admin_sp_app_id" {
  description = "Application ID of the service principal to grant admin access"
  type        = string
}

variable "unprotect_for_destroy" {
  description = "Set to true before destroy to unprotect the production branch"
  type        = bool
  default     = false
}

# Project: top-level container for branches, endpoints, databases, and roles.
resource "databricks_postgres_project" "this" {
  project_id = "my-lakebase-project"
  # purge_on_delete = true # Uncomment to permanently delete on destroy (default: soft delete, 7-day retention).
  spec = {
    pg_version   = 17
    display_name = "My Lakebase Project"
    default_endpoint_settings = {
      autoscaling_limit_min_cu = 0.5
      autoscaling_limit_max_cu = 4.0
      suspend_timeout_duration = "300s"
    }
  }
}

# Configure the implicitly created production branch as protected.
resource "databricks_postgres_branch" "production" {
  branch_id = "production"
  parent    = databricks_postgres_project.this.name
  spec = {
    no_expiry    = true
    is_protected = var.unprotect_for_destroy ? false : true
  }
  replace_existing = true
}

# Configure the implicitly created primary endpoint with HA.
# HA requires no_suspension = true. group.min = 2 adds a standby for automatic failover.
resource "databricks_postgres_endpoint" "primary" {
  endpoint_id = "primary"
  parent      = databricks_postgres_branch.production.name
  spec = {
    endpoint_type            = "ENDPOINT_TYPE_READ_WRITE"
    autoscaling_limit_min_cu = 0.5
    autoscaling_limit_max_cu = 4.0
    no_suspension            = true
    group = {
      min                         = 2
      max                         = 2
      enable_readable_secondaries = true
    }
  }
  replace_existing = true
}

# Grant workspace-level CAN_MANAGE on the project to the service principal.
# Use status.project_id (bare ID), not .name (full resource path): the permissions
# API rejects the full path with a "resource type not found" error.
resource "databricks_permissions" "project" {
  database_project_name = databricks_postgres_project.this.status.project_id
  access_control {
    service_principal_name = var.admin_sp_app_id
    permission_level       = "CAN_MANAGE"
  }
}

# Create a Postgres role backed by the service principal with full database privileges.
# depends_on serializes creation because Lakebase processes one branch operation at a time.
resource "databricks_postgres_role" "admin_sp" {
  role_id = "admin-sp"
  parent  = databricks_postgres_branch.production.name
  spec = {
    identity_type    = "SERVICE_PRINCIPAL"
    postgres_role    = var.admin_sp_app_id
    auth_method      = "LAKEBASE_OAUTH_V1"
    membership_roles = ["DATABRICKS_SUPERUSER"]
    attributes = {
      createdb   = true
      createrole = true
      bypassrls  = true
    }
  }
  depends_on = [databricks_postgres_endpoint.primary]
}

# Create a Postgres database owned by the admin SP role.
resource "databricks_postgres_database" "app" {
  database_id = "app"
  parent      = databricks_postgres_branch.production.name
  spec = {
    postgres_database = "app"
    role              = databricks_postgres_role.admin_sp.name
  }
}

# Register the Postgres database in Unity Catalog. This makes the database queryable
# from Databricks SQL and notebooks through Lakehouse Federation, and serves as the
# parent namespace for synced tables that live inside the Lakebase Catalog.
# create_database_if_missing is set explicitly because the database is managed by
# the databricks_postgres_database resource above.
resource "databricks_postgres_catalog" "app_catalog" {
  catalog_id = "app_catalog"
  spec = {
    postgres_database          = databricks_postgres_database.app.status.postgres_database
    branch                     = databricks_postgres_branch.production.name
    create_database_if_missing = false
  }
}

# Sync a Unity Catalog Delta table into the Lakebase database continuously.
# Prefixing synced_table_id with the Lakebase Catalog name places the synced table
# inside the catalog so it's discoverable alongside the rest of the catalog's contents.
# postgres_database references the catalog's status, which implicitly orders this
# resource after the catalog without an explicit depends_on.
resource "databricks_postgres_synced_table" "orders" {
  synced_table_id = "app_catalog.default.orders_synced"
  spec = {
    branch                             = databricks_postgres_branch.production.name
    postgres_database                  = databricks_postgres_catalog.app_catalog.status.postgres_database
    source_table_full_name             = "my_catalog.default.orders"
    primary_key_columns                = ["order_id"]
    scheduling_policy                  = "CONTINUOUS"
    create_database_objects_if_missing = true
    new_pipeline_spec = {
      storage_catalog = "my_catalog"
      storage_schema  = "default"
    }
  }
}

# Databricks App connected to the Lakebase project.
# database must be the full resource name (databricks_postgres_database.app.name),
# not the Postgres database name. permission must be "CAN_CONNECT_AND_CREATE".
resource "databricks_app" "this" {
  name        = "my-lakebase-app"
  description = "App backed by Lakebase autoscaling project"
  depends_on  = [databricks_postgres_database.app]
  resources = [{
    name = "lakebase-db"
    postgres = {
      branch     = databricks_postgres_branch.production.name
      database   = databricks_postgres_database.app.name
      permission = "CAN_CONNECT_AND_CREATE"
    }
  }]
}
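
Several resources above reference each other by full resource name (.name), while the permissions resource uses the bare project ID. To see the difference after an apply, you could expose both as outputs. The following is a sketch only: the output names are hypothetical, and it uses only attributes that the configuration above already references.

```hcl
# Hypothetical output names; the attributes are the ones referenced in the configuration above.

# Full resource path of the project, used as the parent of the production branch.
output "project_name" {
  value = databricks_postgres_project.this.name
}

# Bare project ID, the form passed to databricks_permissions.
output "project_id" {
  value = databricks_postgres_project.this.status.project_id
}

# Full resource name of the production branch, used as the parent of the endpoint, role, and database.
output "production_branch_name" {
  value = databricks_postgres_branch.production.name
}

# Full resource name of the app database, the form passed to the app's database field.
output "app_database_name" {
  value = databricks_postgres_database.app.name
}
```

Running terraform output after terraform apply prints these values, which can help confirm which form each consuming resource receives.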

Additional resources