Best practices for Databricks Apps

This page lists important best practices for developing and running Databricks Apps. These guidelines focus on security, performance, and platform requirements.

General best practices

Use Databricks-native features for data processing. App compute is optimized for UI rendering. Use Databricks SQL for queries and datasets, Lakeflow Jobs for batch processing, and Model Serving for AI inference workloads. Offload heavy data processing to these services to avoid performance issues. Test your app under expected load conditions to verify that it meets your requirements.
Implement graceful shutdown handling. Your app must shut down within 15 seconds after it receives a SIGTERM signal. Otherwise, it is forcibly terminated with SIGKILL.
Avoid privileged operations. Apps run as non-privileged users and can't perform actions that require elevated permissions such as root access. You can't install system-level packages using package managers like apt-get, yum, or apk. Instead, use Python packages from PyPI or Node.js packages from npm to manage your app dependencies.
Understand platform-managed networking. Requests are forwarded through a reverse proxy, so your app can’t depend on the origin of requests. Databricks handles TLS termination and requires apps to support HTTP/2 cleartext (H2C). Don’t implement custom TLS handling.
Bind to the correct host and port. Your app must listen on 0.0.0.0 and use the port specified in the DATABRICKS_APP_PORT environment variable. See environment variables for details.
Minimize container startup time. Keep initialization logic lightweight to reduce cold-start latency. Avoid blocking operations like large dependency installs or external API calls during startup. Load heavy resources only when needed.
Log to stdout and stderr. Databricks captures logs from standard output and error streams. Use these for all logging to ensure logs are visible in the Databricks UI. Avoid writing logs to local files.
Handle unexpected errors gracefully. Implement global exception handling to prevent crashes from uncaught errors. Return proper HTTP error responses without exposing stack traces or sensitive data.
Pin dependency versions. Use exact version numbers in your requirements.txt file to ensure consistent environments across builds. Avoid using unpinned or latest versions of packages.
Validate and sanitize user input. Always validate incoming data and sanitize it to prevent injection attacks or malformed inputs, even in internal-facing apps.
Use in-memory caching for expensive operations. Cache frequently used data like query results or API responses to reduce latency and avoid redundant processing. Use functools.lru_cache, cachetools, or similar libraries, and scope caches carefully in multi-user apps.
Use asynchronous request patterns for long-running operations. Avoid synchronous requests that wait for operations to complete, which can time out. Instead, make an initial request to start the operation, then periodically query the resource state or endpoint to check completion status.
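
Several of the items above (graceful shutdown, binding to the correct host and port, and logging to stdout) can be sketched together in one small program. The following is a minimal illustration using only the Python standard library, not a recommended production setup: the handler class, the `run_app` function, and the local fallback port of 8000 are assumptions made for this example, and the standard-library server does not address the H2C requirement mentioned above. In practice a web framework would usually take this role.

```python
import logging
import os
import signal
import sys
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Send all log output to stdout so it can be captured from the standard streams.
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
log = logging.getLogger("app")

DEFAULT_PORT = 8000  # assumption: fallback for local runs only


def resolve_bind_address(env=None):
    """Return (host, port): listen on 0.0.0.0 and the port from DATABRICKS_APP_PORT."""
    env = os.environ if env is None else env
    port = int(env.get("DATABRICKS_APP_PORT", DEFAULT_PORT))
    return ("0.0.0.0", port)


class HealthHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short plain-text body."""

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        # Route access logs through the stdout logger instead of the default stderr write.
        log.info("%s - %s", self.address_string(), fmt % args)


def run_app():
    """Start the server and stop it promptly when SIGTERM arrives."""
    server = ThreadingHTTPServer(resolve_bind_address(), HealthHandler)

    def handle_sigterm(signum, frame):
        log.info("SIGTERM received, shutting down")
        # shutdown() blocks until serve_forever() returns, so it must not be
        # called from the thread that is running serve_forever().
        threading.Thread(target=server.shutdown, daemon=True).start()

    signal.signal(signal.SIGTERM, handle_sigterm)
    log.info("Listening on %s:%d", *server.server_address)
    server.serve_forever()
    server.server_close()
    log.info("Shutdown complete")
```

Calling `run_app()` starts the server. Keeping the shutdown path this short (stop accepting requests, close the socket, exit) leaves plenty of margin inside the 15-second window.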

Security best practices

Follow the principle of least privilege. Grant only the permissions necessary for each user or group. Use CAN USE instead of CAN MANAGE unless full control is required. See Best practices for permissions.
Choose authentication methods carefully. Use service principals when access to resources and data is the same for all users of the app. Implement user authentication only when the app must respect the calling user's permissions, and only in workspaces with trusted app authors and peer-reviewed app code.
Use dedicated service principals for each app. Don't share service principal credentials across apps or users. Grant only the minimum permissions needed, such as CAN USE or CAN QUERY. Rotate service principal credentials when app creators leave your organization. See Manage app access to resources.
Isolate app environments. Use different workspaces to separate development, staging, and production apps. This prevents accidental access to production data during development and testing.
Access data through appropriate compute. Don't configure your app to access or process data directly. Use SQL warehouses for queries, Model Serving for AI inference, and Lakeflow Jobs for batch processing.
Manage secrets. Never expose raw secret values in environment variables. Use valueFrom in your app config and rotate secrets regularly, especially when team roles change. See Best practices.
Minimize scopes and log user actions. When using user authorization, request only the scopes that your app needs, and log all user actions with structured audit records. See Best practices for user authorization.
Restrict outbound network access. Allow only the domains that your app needs, such as package repositories and external APIs. Use dry-run mode and denial logs to validate your configuration. See Best practices for configuring network policies.
Follow secure coding practices. Parameterize SQL queries to prevent injection attacks and apply general secure development guidelines, such as input validation and error handling. See Statement Execution API: Run SQL on warehouses.
Monitor for suspicious activity. Regularly review audit logs for unusual access patterns or unauthorized actions. Set up alerts for critical security events.
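
The advice above on parameterizing SQL queries can be sketched as follows. This example only builds a request body of the kind the Statement Execution API accepts (a statement with a named parameter marker plus a separate `parameters` list). It does not send the request. The function name, the `orders` table, and its columns are hypothetical, and the exact field names should be checked against the API reference before use.

```python
def build_statement_request(warehouse_id, customer_name):
    """Build a request body with the user-supplied value passed as a parameter.

    The SQL text is a fixed string. The user input travels separately in
    `parameters`, so it is never concatenated into the statement.
    """
    return {
        "warehouse_id": warehouse_id,
        # `orders` and its columns are hypothetical names for illustration.
        "statement": (
            "SELECT order_id, total FROM orders "
            "WHERE customer_name = :customer_name"
        ),
        "parameters": [
            {"name": "customer_name", "value": customer_name, "type": "STRING"},
        ],
    }


# Hostile input stays inside the parameter value and leaves the SQL text unchanged.
hostile = "x' OR '1'='1"
request_body = build_statement_request("hypothetical-warehouse-id", hostile)
```

The design point is that the statement text is identical for every caller, so validation and review only need to cover one fixed query string.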
