Best practices for Databricks Apps
This page lists important best practices for developing and running Databricks Apps. These guidelines focus on security, performance, and platform requirements.
General best practices
-
Use Databricks-native features for data processing. App compute is optimized for UI rendering. Use Databricks SQL for queries and datasets, Lakeflow Jobs for batch processing, and Model Serving for AI inference workloads.
-
Implement graceful shutdown handling. Your app must shut down within 15 seconds after it receives a SIGTERM signal, or it is forcibly terminated with SIGKILL.
-
Avoid privileged operations. Apps run as non-privileged users and can’t perform actions that require elevated permissions such as root access.
-
Understand platform-managed networking. Requests are forwarded through a reverse proxy, so your app can’t depend on the origin of requests. Databricks handles TLS termination and requires apps to support HTTP/2 cleartext (H2C). Don’t implement custom TLS handling.
-
Bind to the correct host and port. Your app must listen on 0.0.0.0 and use the port specified in the DATABRICKS_APP_PORT environment variable. See environment variables for details.
-
Minimize container startup time. Keep initialization logic lightweight to reduce cold-start latency. Avoid blocking operations like large dependency installs or external API calls during startup. Load heavy resources only when needed.
-
Log to stdout and stderr. Databricks captures logs from standard output and error streams. Use these for all logging to ensure logs are visible in the Databricks UI. Avoid writing logs to local files.
-
Handle unexpected errors gracefully. Implement global exception handling to prevent crashes from uncaught errors. Return proper HTTP error responses without exposing stack traces or sensitive data.
-
Pin dependency versions. Use exact version numbers in your requirements.txt file to ensure consistent environments across builds. Avoid using unpinned or latest versions of packages.
-
Validate and sanitize user input. Always validate incoming data and sanitize it to prevent injection attacks or malformed inputs, even in internal-facing apps.
-
Use in-memory caching for expensive operations. Cache frequently used data like query results or API responses to reduce latency and avoid redundant processing. Use functools.lru_cache, cachetools, or similar libraries, and scope caches carefully in multi-user apps.
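Several of the practices above (binding to 0.0.0.0 on the port from DATABRICKS_APP_PORT, logging to stdout, and shutting down on SIGTERM) can be sketched together in Python. This is a minimal, framework-agnostic sketch under stated assumptions, not a Databricks-provided API: the names bind_address, request_shutdown, and run_until_sigterm are hypothetical, the start_server and stop_server callables stand in for whatever your web framework provides, and the fallback port 8000 is an assumption for local runs only.

```python
import logging
import os
import signal
import sys
import threading

# Send every log record to stdout so the platform can capture it.
logging.basicConfig(
    stream=sys.stdout,
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
log = logging.getLogger("app")

# Set by the signal handler; the main thread waits on it.
shutdown_requested = threading.Event()


def bind_address(default_port: int = 8000) -> tuple:
    """Return (host, port): all interfaces plus the platform-assigned port."""
    # default_port is an assumption, used only when the variable is unset
    # (for example during local development).
    port = int(os.environ.get("DATABRICKS_APP_PORT", default_port))
    return ("0.0.0.0", port)


def request_shutdown(signum, frame):
    """Signal handler: only record the request, so it returns immediately."""
    shutdown_requested.set()


def run_until_sigterm(start_server, stop_server):
    """Start the app, block until SIGTERM arrives, then stop it.

    start_server(host, port) and stop_server() are hypothetical callables
    supplied by your web framework. start_server must not block, and
    stop_server should finish well inside the 15-second limit.
    """
    # signal.signal can only be called from the main thread.
    signal.signal(signal.SIGTERM, request_shutdown)
    host, port = bind_address()
    start_server(host, port)
    log.info("Listening on %s:%d", host, port)
    shutdown_requested.wait()
    log.info("SIGTERM received, shutting down")
    stop_server()
```

The handler does nothing except set an event, which keeps it safe to run at any point; the actual cleanup happens on the main thread after the wait returns. Call run_until_sigterm from your entry point with callables that start and stop your server.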
Security best practices
-
Follow the principle of least privilege. Grant only the permissions necessary for each user or group. Use CAN USE instead of CAN MANAGE unless full control is required. See Best practices for permissions.
-
Use dedicated service principals for each app. Don’t share service principal credentials across apps or users. Grant only the minimum permissions needed, such as CAN USE or CAN QUERY. See Manage app access to resources.
-
Manage secrets. Never expose raw secret values in environment variables. Use valueFrom in your app config and rotate secrets regularly, especially when team roles change. See Best practices for managing secrets.
-
Minimize scopes and log user actions. When using user authorization, request only the scopes your app needs, and log all user actions with structured audit records. See Best practices for user authorization.
-
Restrict outbound network access. Allow only the domains your app needs, such as package repositories and external APIs. Use dry-run mode and denial logs to validate your configuration. See Best practices for configuring network policies.
-
Follow secure coding practices. Parameterize SQL queries to prevent injection attacks and apply general secure development guidelines such as input validation and error handling. See Statement Execution API: Run SQL on warehouses.
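To illustrate query parameterization, the following sketch builds a request body for the Statement Execution API that uses named parameter markers instead of string concatenation. It is an illustration under assumptions, not a complete client: the function name build_statement_request, the table main.sales.orders, and its columns are hypothetical, and sending the request (with authentication) is left out.

```python
def build_statement_request(warehouse_id: str, customer_id: str, min_total: float) -> dict:
    """Build a request body with named parameters for a SQL statement.

    User-supplied values travel in the "parameters" list and are never
    spliced into the SQL text, so they cannot change the query structure.
    """
    return {
        "warehouse_id": warehouse_id,
        # :customer_id and :min_total are named parameter markers.
        # The table and column names are hypothetical.
        "statement": (
            "SELECT order_id, total FROM main.sales.orders "
            "WHERE customer_id = :customer_id AND total >= :min_total"
        ),
        "parameters": [
            {"name": "customer_id", "value": customer_id, "type": "STRING"},
            # Parameter values are passed as strings alongside a SQL type.
            {"name": "min_total", "value": str(min_total), "type": "DOUBLE"},
        ],
    }
```

Even if customer_id contains quote characters or SQL keywords, the statement text stays fixed and the input is treated purely as a value.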