Scale to zero

Preview

This feature is in Public Preview in the following regions: us-east-1, us-west-2, eu-west-1.

Lakebase Autoscaling is the new version of Lakebase with autoscaling compute, scale-to-zero, branching, and instant restore. For a feature comparison with Lakebase Provisioned, see Choosing between versions.

Scale to zero automatically suspends your Lakebase compute after a period of inactivity, minimizing costs for databases that aren't continuously active. This feature is particularly valuable for development, testing, and staging environments, as well as production databases with predictable idle periods.

When scale to zero is enabled:

Your compute automatically suspends after a period of inactivity (default is 5 minutes, minimum is 60 seconds)
You pay only for active compute time, not for idle periods
The compute automatically reactivates within a few hundred milliseconds when you run a new query

This diagram illustrates scale to zero behavior alongside autoscaling, showing an inactive period followed by automatic suspension until the database is accessed again.

Scale to zero visualization

Scale to zero works independently from autoscaling. While autoscaling adjusts compute resources during active periods based on workload demand, scale to zero suspends the compute entirely during inactivity, reducing compute costs to zero.

How scale to zero works

Automatic suspension

When your compute remains idle—receiving no queries or connections—for the configured timeout period, Lakebase automatically suspends it. During suspension:

The compute consumes no resources and incurs no compute costs
Your data remains safely stored and available
Connection strings and credentials remain valid
The compute endpoint still accepts connection requests, and the next request triggers reactivation

Automatic reactivation

When a new query or connection request arrives at a suspended compute, Lakebase automatically reactivates it. The reactivation process:

Requires no manual intervention
Handles the pending connection request transparently once the compute is active
Restores the compute to its configured minimum size (if autoscaling is enabled)

Applications should implement connection retry logic to handle the brief reactivation period gracefully.
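The retry logic mentioned above can be sketched as follows. This is a minimal illustration, not a Lakebase API: `connect_with_retry`, its parameters, and the exception types treated as transient are all assumptions. Pass in a zero-argument callable that wraps whatever connect function your database driver provides, and adjust `retry_on` to the errors that driver raises.

```python
import time


def connect_with_retry(connect, attempts=5, base_delay=0.1, max_delay=2.0,
                       retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call connect() until it succeeds, backing off between attempts.

    connect  -- zero-argument callable returning a connection (hypothetical;
                for example a lambda wrapping your driver's connect function)
    retry_on -- exception types treated as transient; adjust to your driver
    sleep    -- injectable so the backoff can be tested without waiting
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(delay)
            delay = min(delay * 2, max_delay)  # exponential backoff, capped
```

Because reactivation is described as taking a few hundred milliseconds, a short initial delay with a small number of attempts is usually enough. The exact values here are placeholders to tune for your application.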

Timeout configuration

The scale-to-zero timeout controls how quickly a compute suspends after becoming idle. Choosing a value is a trade-off between cost and reactivation frequency:

Shorter timeouts (60 seconds to 5 minutes): Faster suspension reduces costs but may cause more frequent reactivations for intermittent workloads
Longer timeouts (5 minutes to 1 hour): Fewer reactivations improve user experience for sporadic activity but may increase costs during extended idle periods

The minimum timeout is 60 seconds. The maximum is configurable based on your use case.
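To get a feel for this trade-off, the sketch below estimates active time and reactivation count for a given query pattern. It uses a deliberately simplified model that is an assumption for illustration, not Lakebase's actual billing logic: queries are instantaneous, every query resets the idle timer, and the first query finds the compute suspended. The function name `simulate_compute` is hypothetical.

```python
def simulate_compute(query_times, timeout_s):
    """Return (active_seconds, reactivations) for sorted query timestamps in seconds.

    Simplified model: the compute stays active for timeout_s seconds after the
    most recent query, then suspends until the next query arrives.
    """
    active = 0.0
    reactivations = 0
    window_start = window_end = None
    for t in query_times:
        if window_end is None or t > window_end:
            # Compute was suspended: close the previous active window, start a new one.
            if window_end is not None:
                active += window_end - window_start
            reactivations += 1
            window_start = t
        window_end = t + timeout_s  # every query resets the idle timer
    if window_end is not None:
        active += window_end - window_start
    return active, reactivations


queries = [0, 120, 240, 3600]            # three queries close together, one an hour later
short = simulate_compute(queries, 60)    # 60-second timeout
longer = simulate_compute(queries, 300)  # 5-minute timeout
```

Under this model, the 60-second timeout yields 240 active seconds and 4 reactivations, while the 5-minute timeout yields 840 active seconds and only 2 reactivations: less waiting for users, at the price of more active time.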

Scale to zero benefits

Cost reduction: By suspending inactive computes, you pay only for actual usage time. A development database that is active 8 hours per day incurs roughly one-third (8 of 24 hours) of the compute cost of an always-active compute of the same size.
Flexible deployment: Scale to zero enables cost-effective deployment of multiple environments. You can maintain separate development, testing, staging, and preview environments without incurring 24/7 compute costs for each.
No manual management: The system automatically handles suspension and reactivation, eliminating the need to manually start and stop computes based on usage patterns.
Preserved configuration: All compute settings, connection details, and database configurations remain intact during suspension. When the compute reactivates, it resumes with the same configuration.
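The one-third figure in the cost example is simple arithmetic, and the sketch below makes it explicit. It also accounts for the idle tail before each suspension, which the headline number ignores. The function `active_fraction` and its parameters are hypothetical, and the model assumes cost is proportional to active time at a fixed compute size.

```python
def active_fraction(active_hours_per_day, sessions_per_day=0, timeout_minutes=0.0):
    """Fraction of a 24-hour day during which the compute is active.

    Assumes each work session ends with one idle tail of timeout_minutes
    before the compute suspends; both parameters are illustrative.
    """
    active_minutes = active_hours_per_day * 60 + sessions_per_day * timeout_minutes
    return min(active_minutes / (24 * 60), 1.0)


no_tail = active_fraction(8)          # 8 of 24 hours
with_tail = active_fraction(8, 2, 5)  # two sessions, 5-minute timeout each
```

With two sessions per day and the default 5-minute timeout, the active share rises from 480 to 490 minutes out of 1440, so the timeout tail has little effect at this usage level.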

Configuring scale to zero

Scale to zero can be enabled or disabled for any compute. When enabled, you configure the inactivity timeout that triggers suspension (default is 5 minutes, minimum is 60 seconds).

A common configuration is for production branches to have scale to zero disabled for continuous availability, while development branches have it enabled to optimize costs.

For detailed instructions on configuring scale-to-zero settings, see Manage computes.

Common scale to zero scenarios

Development and testing environments

Development branches for testing schema changes, validating data pipelines, or experimenting with new features typically see intermittent activity. Scale to zero automatically suspends these computes during evenings, weekends, and between work sessions, significantly reducing costs.

Staging and preview environments

Staging environments used for pre-deployment validation or preview environments created for pull requests often remain idle between testing cycles. Scale to zero ensures these environments consume resources only during active testing periods.

AI agents and applications with idle periods

AI agents, chatbots, or internal tools that serve specific business hours or have predictable downtime patterns can benefit from scale to zero. The compute suspends during off-hours and reactivates automatically when users return.

Multi-tenant application databases

Applications serving multiple customers can use scale to zero for tenant-specific databases. Computes for inactive tenants suspend automatically, reducing aggregate compute costs across all tenants.

Important considerations

Session context reset

When a compute suspends and later reactivates, the session context resets. This includes:

In-memory statistics and cache contents
Temporary tables and prepared statements
Session-specific configuration settings
Connection pools and active transactions

If your application requires persistent session data, consider disabling scale to zero to maintain continuous compute availability.
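If disabling scale to zero is not an option, another approach is to re-apply session-scoped state every time the application opens a connection, so that a reset is harmless. The sketch below shows the pattern with a generic DB-API connection. The helper `open_session` and the `SETUP` list are hypothetical, and `sqlite3` is used only as a stand-in so the example runs without a database server.

```python
import sqlite3  # stand-in driver so the sketch runs without a database server


def open_session(connect, setup_statements):
    """Open a connection and re-apply session-scoped state before use.

    connect          -- zero-argument callable returning a DB-API connection
    setup_statements -- SQL run on every new connection (settings, temp tables)
    """
    conn = connect()
    cur = conn.cursor()
    for statement in setup_statements:
        cur.execute(statement)
    cur.close()
    return conn


# Illustrative only: with a PostgreSQL driver this list might instead hold
# statements such as "SET statement_timeout = '30s'".
SETUP = ["CREATE TEMP TABLE scratch (id INTEGER)"]

conn = open_session(lambda: sqlite3.connect(":memory:"), SETUP)
cur = conn.cursor()
cur.execute("INSERT INTO scratch VALUES (1)")  # the temp table exists again
```

Routing every new connection through one helper like this keeps the setup in a single place, which matters when connections are re-established after a suspension.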

Startup latency

The brief reactivation period (typically a few hundred milliseconds) may impact user experience for the first query after suspension. For applications requiring immediate response times, you can:

Disable scale to zero for always-available computes
Implement application-level connection warming
Use longer timeout periods to reduce reactivation frequency
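Application-level connection warming can be as simple as running a trivial query at startup, so that the reactivation delay is absorbed before the first user request. The sketch below is one possible shape for this. The `warm_up` helper is hypothetical, and `sqlite3` stands in for a real driver so the example is runnable.

```python
import sqlite3  # stand-in driver so the sketch runs without a database server
import time


def warm_up(connect, probe="SELECT 1"):
    """Open a connection, run a trivial query, and return (conn, seconds taken)."""
    start = time.perf_counter()
    conn = connect()
    cur = conn.cursor()
    cur.execute(probe)
    cur.fetchone()
    cur.close()
    return conn, time.perf_counter() - start


conn, elapsed = warm_up(lambda: sqlite3.connect(":memory:"))
```

Logging `elapsed` gives a rough view of how long reactivation takes in practice. Keep in mind that repeating the probe on a schedule counts as activity, so it would keep the compute from suspending and offset the cost savings of scale to zero.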

Default branch behavior

The default branch (typically production) has scale to zero disabled by default to ensure continuous availability. You can enable it if your production workload has predictable idle periods, but carefully consider the impact on user experience.

Scale to zero and autoscaling

Scale to zero complements autoscaling to optimize both performance and costs:

During active periods: Autoscaling adjusts compute size based on workload demand within your configured range, scaling up during high activity and down during lighter loads.
During inactive periods: After the scale-to-zero timeout, the compute suspends entirely and compute costs drop to zero regardless of the configured autoscaling range.
When reactivated: The compute restarts at the minimum autoscaling size (if autoscaling is enabled), and autoscaling then adjusts resources based on the new workload.

This combination maximizes efficiency: autoscaling optimizes resource usage during activity, while scale to zero eliminates costs during inactivity.

Next steps

Manage computes: Learn how to configure scale-to-zero settings
Metrics dashboard: View how metrics reflect inactive compute periods
Autoscaling: Understand how computes adjust resources during active periods
Database branches: Learn about creating isolated database environments
