Scale to zero
Lakebase Postgres (Autoscaling Beta) is the next version of Lakebase, available for evaluation only. For production workloads, use Lakebase Public Preview. See choosing between versions to understand which version is right for you.
Scale to zero automatically suspends your Lakebase compute after a period of inactivity, minimizing costs for databases that aren't continuously active. This feature is particularly valuable for development, testing, and staging environments, as well as production databases with predictable idle periods.
When scale to zero is enabled:
- Your compute automatically suspends after a period of inactivity (default is 5 minutes, minimum is 60 seconds)
- You pay only for active compute time, not for idle periods
- The compute automatically reactivates within a few hundred milliseconds when you run a new query
[Diagram: scale to zero behavior alongside autoscaling, showing an inactive period followed by automatic suspension until the database is accessed again.]

Scale to zero works independently from autoscaling. While autoscaling adjusts compute resources during active periods based on workload demand, scale to zero suspends the compute entirely during inactivity, reducing compute costs to zero.
How scale to zero works
Automatic suspension
When your compute receives no queries or connections for the configured timeout period, Lakebase automatically suspends it. During suspension:
- The compute consumes no resources and incurs no compute costs
- Your data remains safely stored and available
- Connection strings and credentials remain valid
- The compute endpoint remains accessible but inactive
Automatic reactivation
When a new query or connection request arrives at a suspended compute, Lakebase automatically reactivates it. The reactivation process:
- Requires no manual intervention
- Transparently handles the connection request once active
- Restores the compute to its configured minimum size (if autoscaling is enabled)
Applications should implement connection retry logic to handle the brief reactivation period gracefully.
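The retry logic mentioned above can be sketched as a small wrapper with exponential backoff. This is a minimal illustration, not a Lakebase API: the function name `connect_with_retry`, its parameters, and the default delays are all assumptions chosen for the example, and `connect` stands in for whatever your database driver uses to open a connection.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    attempts: int = 5,
    base_delay: float = 0.1,
    max_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds or the attempts are exhausted.

    All names and defaults here are illustrative assumptions.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return connect()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff with jitter, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay * random.uniform(0.5, 1.0))
    raise AssertionError("unreachable")
```

With a driver such as psycopg, `connect` might be `lambda: psycopg.connect(dsn)` and `retry_on` the driver's `OperationalError`. Which exception a suspended compute actually surfaces depends on your driver, so treat that choice as something to verify.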
Timeout configuration
You configure the scale-to-zero timeout to control how quickly a compute suspends after becoming idle. Choosing a timeout is a trade-off:
- Shorter timeouts (60 seconds to 5 minutes): Faster suspension reduces costs but may cause more frequent reactivations for intermittent workloads
- Longer timeouts (5 minutes to 1 hour): Fewer reactivations improve user experience for sporadic activity but may increase costs during extended idle periods
The minimum timeout is 60 seconds. The maximum is configurable based on your use case.
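To make the trade-off above concrete, the following sketch replays a list of query timestamps against a given timeout and reports how long the compute would stay active and how often it would reactivate. It is a toy model under stated assumptions: queries are instantaneous, the compute starts suspended, and it suspends exactly `timeout` seconds after the last query. The function name `simulate_timeout` is hypothetical, and the model says nothing about how Lakebase actually meters usage.

```python
from typing import Iterable, Tuple


def simulate_timeout(query_times: Iterable[float], timeout: float) -> Tuple[float, int]:
    """Return (active_seconds, reactivations) for sorted query timestamps.

    Toy model (assumption): each query is instantaneous and the compute
    suspends exactly `timeout` seconds after the most recent query.
    """
    if timeout < 60:
        raise ValueError("timeout must be at least 60 seconds")
    active_seconds = 0.0
    reactivations = 0
    window_start = None  # time of the query that woke the compute
    last_query = None    # time of the most recent query
    for t in query_times:
        if last_query is None or t - last_query >= timeout:
            if last_query is not None:
                # Close the previous active window, including the idle tail.
                active_seconds += (last_query - window_start) + timeout
            window_start = t
            reactivations += 1  # includes the very first wake-up
        last_query = t
    if last_query is not None:
        active_seconds += (last_query - window_start) + timeout
    return active_seconds, reactivations
```

For queries at 0, 120, 240, and 3600 seconds, `simulate_timeout(..., 60)` returns `(240.0, 4)` while `simulate_timeout(..., 300)` returns `(840.0, 2)`: the shorter timeout keeps the compute active for less time but wakes it twice as often.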
Scale to zero benefits
- Cost reduction: By suspending inactive computes, you pay only for actual usage time. A development database that is active 8 hours per day incurs roughly one-third of the compute cost of an always-active compute (8 of 24 hours).
- Flexible deployment: Scale to zero enables cost-effective deployment of multiple environments. You can maintain separate development, testing, staging, and preview environments without incurring 24/7 compute costs for each.
- No manual management: The system automatically handles suspension and reactivation, eliminating the need to manually start and stop computes based on usage patterns.
- Preserved configuration: All compute settings, connection details, and database configurations remain intact during suspension. When the compute reactivates, it resumes with the same configuration.
Configuring scale to zero
Scale to zero can be enabled or disabled for any compute. When enabled, you configure the inactivity timeout that triggers suspension (default is 5 minutes, minimum is 60 seconds).
A common configuration is for production branches to have scale to zero disabled for continuous availability, while development branches have it enabled to optimize costs.
For detailed instructions on configuring scale-to-zero settings, see Manage computes.
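The defaults and limits described in this section can be captured in a small validation helper, for example when you keep a per-branch plan in your own tooling. This is a sketch only: `ScaleToZeroSettings`, its field names, and `BRANCH_SETTINGS` are hypothetical local names, not Lakebase API types or settings keys.

```python
from dataclasses import dataclass

MIN_TIMEOUT_SECONDS = 60       # minimum stated in this section
DEFAULT_TIMEOUT_SECONDS = 300  # default of 5 minutes stated in this section


@dataclass(frozen=True)
class ScaleToZeroSettings:
    """Hypothetical local model of the settings; not a Lakebase API type."""

    enabled: bool = True
    timeout_seconds: int = DEFAULT_TIMEOUT_SECONDS

    def __post_init__(self) -> None:
        # Only validate the timeout when scale to zero is actually enabled.
        if self.enabled and self.timeout_seconds < MIN_TIMEOUT_SECONDS:
            raise ValueError(
                f"timeout must be at least {MIN_TIMEOUT_SECONDS} seconds"
            )


# Example plan mirroring the common configuration described above:
# production always on, development suspending after the default timeout.
BRANCH_SETTINGS = {
    "production": ScaleToZeroSettings(enabled=False),
    "development": ScaleToZeroSettings(enabled=True, timeout_seconds=300),
}
```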
Common scale to zero scenarios
Development and testing environments
Development branches for testing schema changes, validating data pipelines, or experimenting with new features typically see intermittent activity. Scale to zero automatically suspends these computes during evenings, weekends, and between work sessions, significantly reducing costs.
Staging and preview environments
Staging environments used for pre-deployment validation or preview environments created for pull requests often remain idle between testing cycles. Scale to zero ensures these environments consume resources only during active testing periods.
AI agents and applications with idle periods
AI agents, chatbots, or internal tools that are used only during specific business hours or have predictable downtime patterns can benefit from scale to zero. The compute suspends during off-hours and reactivates automatically when users return.
Multi-tenant application databases
Applications serving multiple customers can use scale to zero for tenant-specific databases. Computes for inactive tenants suspend automatically, reducing aggregate compute costs across all tenants.
Important considerations
Session context reset
When a compute suspends and later reactivates, the session context resets. This includes:
- In-memory statistics and cache contents
- Temporary tables and prepared statements
- Session-specific configuration settings
- Connection pools and active transactions
If your application requires persistent session data, consider disabling scale to zero to maintain continuous compute availability.
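When the session state is cheap to rebuild, an alternative is to replay it every time a connection is opened. The sketch below assumes a Python DB-API style driver (a connection with `cursor()` and `commit()`); the helper name `open_session` and the statements in `SESSION_SETUP` are illustrative assumptions, so substitute the settings and temporary objects your application actually relies on.

```python
from typing import Any, Callable, Sequence


def open_session(connect: Callable[[], Any], setup_statements: Sequence[str]) -> Any:
    """Open a connection and replay session-level setup on it.

    Hypothetical helper: `connect` is any zero-argument function that
    returns a DB-API style connection.
    """
    conn = connect()
    cur = conn.cursor()
    try:
        for sql in setup_statements:
            cur.execute(sql)
    finally:
        cur.close()
    # Commit so the setup is not undone if a later transaction rolls back.
    conn.commit()
    return conn


# Example setup for a Postgres session (values are placeholders).
SESSION_SETUP = [
    "SET statement_timeout = '30s'",
    "SET search_path TO app, public",
    "CREATE TEMP TABLE IF NOT EXISTS recent_ids (id bigint)",
]
```

Calling `open_session` from the same place that creates connections for your pool means every fresh connection starts from a known state, whether it follows a reactivation or not.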
Startup latency
The brief reactivation period (typically a few hundred milliseconds) may impact user experience for the first query after suspension. For applications requiring immediate response times, you can:
- Disable scale to zero for always-available computes
- Implement application-level connection warming
- Use longer timeout periods to reduce reactivation frequency
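One way to read "connection warming" from the list above is to send a cheap probe query shortly before real traffic is expected, so that the reactivation delay is absorbed by the probe rather than by a user request. The following is a minimal sketch under that interpretation; `warm_up` and its parameters are hypothetical names, and `run_query` stands in for your driver's execute call.

```python
import time
from typing import Callable


def warm_up(
    run_query: Callable[[str], object],
    probe: str = "SELECT 1",
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Run a cheap probe query and return how long it took, in seconds.

    Hypothetical helper: a slow probe suggests the compute was suspended
    and has just been reactivated.
    """
    start = clock()
    run_query(probe)
    return clock() - start
```

You might call this at application startup or from a scheduled job just before business hours. Probing continuously at intervals shorter than the timeout would keep the compute active and give up the cost savings of scale to zero.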
Default branch behavior
The default branch (typically production) has scale to zero disabled by default to ensure continuous availability. You can enable it if your production workload has predictable idle periods, but carefully consider the impact on user experience.
Scale to zero and autoscaling
Scale to zero complements autoscaling to optimize both performance and costs:
- During active periods: Autoscaling adjusts compute size based on workload demand within your configured range, scaling up during high activity and down during lighter loads.
- During inactive periods: After the scale-to-zero timeout, the compute suspends entirely and compute costs drop to zero regardless of the configured autoscaling range.
- When reactivated: The compute restarts at the minimum autoscaling size (if autoscaling is enabled), and autoscaling then adjusts resources based on the new workload.
This combination maximizes efficiency: autoscaling optimizes resource usage during activity, while scale to zero eliminates costs during inactivity.
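The interaction described in this section can be illustrated with a toy state model. It is a deliberate simplification and not how Lakebase is implemented: the class `ToyCompute`, the unitless sizes, the instant scaling to match demand, and the fall back to minimum size while idle are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ToyCompute:
    """Toy model of one compute; all behavior here is an assumption."""

    min_size: float
    max_size: float
    timeout: float         # scale-to-zero timeout in seconds
    size: float = 0.0      # 0.0 means suspended
    idle_for: float = 0.0  # seconds since demand was last seen

    def step(self, demand: float, dt: float) -> float:
        """Advance the model by dt seconds and return the current size."""
        if demand > 0:
            if self.size == 0.0:
                # Reactivation: restart at the minimum autoscaling size.
                self.size = self.min_size
            else:
                # Autoscaling: follow demand within the configured range.
                self.size = max(self.min_size, min(self.max_size, demand))
            self.idle_for = 0.0
        elif self.size > 0.0:
            self.idle_for += dt
            if self.idle_for >= self.timeout:
                self.size = 0.0  # scale to zero: suspended
            else:
                self.size = self.min_size  # idle but still active
        return self.size
```

Stepping a `ToyCompute(min_size=1, max_size=4, timeout=300)` in 60-second increments with demands `3, 3, 8, 0, 0, 0, 0, 0, 2` yields sizes `1, 3, 4, 1, 1, 1, 1, 0, 1`: it wakes at the minimum, scales up to the maximum, idles at the minimum, suspends after five idle minutes, and wakes at the minimum again.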
Next steps
- Manage computes to learn how to configure scale-to-zero settings
- Autoscaling to understand how computes adjust resources during active periods
- Database branches to learn about creating isolated database environments