Horizontal scaling for Databricks apps
This feature is in Beta. To request access, reach out to your Databricks representative. After the feature is enabled, workspace admins can control access to it from the Previews page. See Manage Databricks previews.
This page describes how to run a Databricks app across multiple instances behind a single app URL for higher availability and concurrency.
Horizontal scaling distributes requests across instances, so a single instance failure or restart doesn't take the app offline. Compute cost scales linearly with the number of instances.
Additional benefits include:
- Session affinity (also known as sticky sessions): Requests from the same user route to the same instance on a best-effort basis, so an app can keep short-lived per-user data (for example, an in-memory cache) on that instance. See Session affinity.
- Zero-downtime deployments: Databricks rolls a new deployment out to a build instance first, and only updates the remaining instances after that build instance succeeds. The existing instances continue to serve traffic throughout.
- Stable build caches: Databricks preserves deployment artifacts across compute updates (for example, when you change instance size or count), so the app doesn't require a full rebuild.
Requirements
The following requirements apply to all horizontally scaled apps:
- During the Beta period, your app must listen on 0.0.0.0 (not 127.0.0.1 or localhost).
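As a minimal sketch of the listen-address requirement, the following standard-library server binds to 0.0.0.0 rather than a loopback address. The handler, the `build_server` helper, and the fallback port 8000 are hypothetical, and reading the port from the `DATABRICKS_APP_PORT` environment variable is an assumption; if you use a web framework, set its host option to 0.0.0.0 instead.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a plain-text 200."""

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def build_server(port=None):
    """Create a server bound to 0.0.0.0 so it accepts connections on all interfaces."""
    if port is None:
        # Assumption: the platform passes the port in DATABRICKS_APP_PORT.
        # The fallback of 8000 is only for local experiments.
        port = int(os.environ.get("DATABRICKS_APP_PORT", "8000"))
    # Binding to 127.0.0.1 or localhost here would not meet the requirement.
    return HTTPServer(("0.0.0.0", port), HealthHandler)


# To start serving: build_server().serve_forever()
```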
Create a horizontally scaled app
To create an app with horizontal scaling enabled:
- In your Databricks workspace, click the app switcher and select Databricks Apps.
- Click + Create app, then click Create a custom app.
- Enter a name and configure the app as described in Create a custom Databricks app.
- In the Configure step, select Enable horizontal scaling.
- Specify the Number of instances (1–5). Databricks recommends at least 2 instances for availability.
- Click Create app.
Enabling horizontal scaling for one app doesn't affect any other apps in the workspace.
Newly created horizontally scaled apps don't include the pre-installed Python libraries. Declare all dependencies in requirements.txt or pyproject.toml. See Manage dependencies for a Databricks app.
Convert a standard app to use horizontal scaling
Convert an existing standard app to a horizontally scaled app from the Settings tab. Conversion introduces no downtime. Your existing standard app keeps serving traffic until the new horizontally scaled deployment passes health checks, at which point Databricks cuts traffic over.
Conversion is reversible. You can convert a horizontally scaled app back to a standard app from the same Settings tab.
Conversion behavior
The following behavior applies at conversion time:
- The instance count is set to 1. After conversion, you can change the count (up to 5). See Manage the instance count.
- The converted app keeps using the pre-installed Python libraries so existing dependencies continue to work. To run on a clean base OS image instead, opt out after conversion. See Opt out of pre-installed Python libraries for Databricks apps.
Test the conversion
Conversion changes your app's runtime image and scaling model. Before converting a production app, validate the change on a duplicate:
- Create a duplicate of the standard app you want to convert.
- Convert the duplicate using the steps in Convert a standard app.
- Verify the app works as expected on the converted duplicate. Check logs, traffic, and any session-affinity behavior.
- Convert the production app after validation.
Convert a standard app
To convert a standard app to horizontally scaled:
- In your Databricks workspace, click the app switcher and select Databricks Apps.
- Click the name of the app you want to convert.
- Click the Settings tab.
- Under Compute, select Enable horizontal scaling. Databricks sets the instance count to 1, which is required at conversion time.
- Click Save.
Databricks spins up new horizontally scaled compute and then redeploys your app onto it. Total time depends on your app's build duration. The existing app keeps serving traffic throughout.
After conversion finishes, you can scale up or opt out of pre-installed libraries.
Manage the instance count
To change the number of instances for a horizontally scaled app:
- On the app details page, click Edit.
- In the Configure step, update the Number of instances.
- Click Save.
The app continues to serve traffic while Databricks applies the change.
Session affinity
Session affinity routes every request from the same user to the same instance whenever possible. An app can use this routing to keep short-lived per-user data (for example, an in-memory cache or temporary files on the local filesystem) on the instance that handles that user's session. Session affinity is also known as sticky sessions.
Session affinity is best-effort. Don't store anything in instance-local state that you can't reconstruct or fetch from a durable store. Persist any data that must outlive a session in a durable store such as Unity Catalog tables.
You can also persist data with Lakebase.
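One way to follow this guidance is to treat instance-local state purely as a cache in front of the durable store. The sketch below is illustrative only: `SessionCache` and the `load_from_store` callable are hypothetical names, and the callable stands in for whatever query you run against your durable store (for example, a Unity Catalog table or Lakebase).

```python
import time


class SessionCache:
    """Per-instance cache whose entries can always be rebuilt from a durable store."""

    def __init__(self, load_from_store, ttl_seconds=300.0, clock=time.monotonic):
        self._load = load_from_store  # hypothetical callable: user_id -> data
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # user_id -> (expires_at, data)

    def get(self, user_id):
        entry = self._entries.get(user_id)
        if entry is not None and entry[0] > self._clock():
            # Fast path: the session stayed on this instance and the entry is fresh.
            return entry[1]
        # Slow path: first request, expired entry, or a session that was
        # reassigned to this instance. Rebuild from the durable store.
        data = self._load(user_id)
        self._entries[user_id] = (self._clock() + self._ttl, data)
        return data
```

Because every miss falls through to the durable store, a session that lands on a different instance only costs one extra load and loses no data.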
Browser-based apps
Session affinity works automatically for browser-based apps. The first request from a browser routes to a randomly selected instance, which sets the __Host-databricks-app-router cookie. Subsequent requests with that cookie route to the same instance.
API clients
To get session affinity for API clients, include the __Host-databricks-app-router cookie in every request and set it to a randomly generated UUID. All requests with the same cookie value route to the same instance.
curl -X GET https://<your-app>.aws.databricksapps.com/api/endpoint \
-H "Authorization: Bearer YOUR_BEARER_TOKEN" \
-b "__Host-databricks-app-router=f8822466-3b1e-423a-988b-54c9e639c250"
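The same request can be built from Python. This is a sketch using only the standard library; `build_request` is a hypothetical helper, and the URL and token are placeholders you supply. The client generates one UUID per logical session and reuses it on every request so that those requests share a cookie value.

```python
import uuid
import urllib.request

AFFINITY_COOKIE = "__Host-databricks-app-router"


def build_request(url, token, session_id=None):
    """Return (request, session_id); reuse session_id on later calls for affinity."""
    # Generate a random UUID once per session if the caller did not pass one.
    session_id = session_id or str(uuid.uuid4())
    headers = {
        "Authorization": f"Bearer {token}",
        "Cookie": f"{AFFINITY_COOKIE}={session_id}",
    }
    request = urllib.request.Request(url, headers=headers, method="GET")
    return request, session_id
```

Send the returned request with `urllib.request.urlopen(request)`, and pass the returned `session_id` back into `build_request` for every later request in the same session.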
Session reassignment
Sessions can move to a different instance in the following cases:
- An instance becomes unavailable (for example, during deployment or a crash).
- You change the instance count.
- Databricks rebalances sessions across instances.
- An individual request is misrouted (rare).
Limitations
The following limitations apply to horizontally scaled apps:
- Each workspace can have at most 5 horizontally scaled apps. Contact your Databricks representative to raise this limit.
- Each horizontally scaled app can have at most 5 instances. Contact your Databricks representative to raise this limit.
- The Logs tab shows logs from a single instance at a time. To view logs across all instances, enable app telemetry.
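When you read logs from several instances together, it can help if each line identifies the instance that wrote it. The following sketch tags log records with an instance label using Python's standard `logging` module. `InstanceTagFilter` and `configure_logger` are hypothetical names, and using the hostname as the default label assumes that each instance reports a distinct hostname, which this page does not confirm.

```python
import logging
import socket


class InstanceTagFilter(logging.Filter):
    """Attach an instance label to every record passing through the handler."""

    def __init__(self, instance_id=None):
        super().__init__()
        # Assumption: the hostname differs between instances of the same app.
        self.instance_id = instance_id or socket.gethostname()

    def filter(self, record):
        record.instance = self.instance_id
        return True  # never drop records, only annotate them


def configure_logger(name="app", stream=None, instance_id=None):
    """Return a logger whose lines include the instance label in brackets."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)  # defaults to stderr when stream is None
    handler.setFormatter(
        logging.Formatter("%(asctime)s [%(instance)s] %(levelname)s %(message)s")
    )
    handler.addFilter(InstanceTagFilter(instance_id))
    logger.addHandler(handler)
    return logger
```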