Best practices for Databricks Apps

This section lists important best practices for developing and running Databricks Apps. These guidelines focus on security, performance, and platform requirements.

  • Use Databricks-native features for data processing. App compute is optimized for UI rendering. Use Databricks SQL for queries and datasets, Databricks Jobs for batch processing, and Model Serving for AI inference workloads.

  • Follow secure coding practices. Parameterize SQL queries to prevent injection attacks and apply general secure development guidelines such as input validation and error handling. See the statement execution API for secure query execution methods.

  • Implement graceful shutdown handling. Your app must shut down within 15 seconds of receiving a SIGTERM signal. Otherwise, the platform forcibly terminates it with SIGKILL.

  • Avoid privileged operations. Apps run as non-privileged users and can’t perform actions that require elevated permissions such as root access.

  • Understand platform-managed networking. Requests are forwarded through a reverse proxy, so the connection your app sees comes from the proxy and your app can’t depend on the origin of requests. Databricks handles TLS termination and requires apps to support HTTP/2 cleartext (H2C). Don’t implement custom TLS handling.

  • Bind to the correct host and port. Your app must listen on 0.0.0.0 and use the port specified in the DATABRICKS_APP_PORT environment variable. See environment variables for details.

  • Minimize container startup time. Keep initialization logic lightweight to reduce cold-start latency. Avoid blocking operations such as large dependency installs or external API calls during startup. Load heavy resources only when they are first needed.

  • Log to stdout and stderr. Databricks captures logs from standard output and error streams. Use these for all logging to ensure logs are visible in the Databricks UI. Avoid writing logs to local files.

  • Handle unexpected errors gracefully. Implement global exception handling to prevent crashes from uncaught errors. Return proper HTTP error responses without exposing stack traces or sensitive data.

  • Pin dependency versions. Use exact version numbers in your requirements.txt file to ensure consistent environments across builds. Avoid using unpinned or latest versions of packages.

  • Validate and sanitize user input. Always validate incoming data and sanitize it to prevent injection attacks or malformed inputs, even in internal-facing apps.

  • Use in-memory caching for expensive operations. Cache frequently used data like query results or API responses to reduce latency and avoid redundant processing. Use functools.lru_cache, cachetools, or similar libraries. In multi-user apps, scope caches carefully, for example by including the user identity in the cache key, so that one user’s data is never returned to another.
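
The guidance on parameterized queries and input validation can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in so that the example runs anywhere; it is not the Databricks SQL interface, and the `orders` table and the helper names `make_demo_db`, `find_orders`, and `parse_limit` are hypothetical. The same principle applies whatever client the app uses: pass values as parameters rather than formatting them into the SQL text.

```python
import sqlite3


def make_demo_db():
    """Create a throwaway in-memory database (stand-in for a real warehouse)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
    conn.executemany(
        "INSERT INTO orders (customer) VALUES (?)",
        [("alice",), ("bob",)],
    )
    return conn


def parse_limit(raw, maximum=1000):
    """Validate a user-supplied row limit before it gets near a query."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        raise ValueError("limit must be an integer") from None
    if not 1 <= value <= maximum:
        raise ValueError(f"limit must be between 1 and {maximum}")
    return value


def find_orders(conn, customer_name, limit="100"):
    """Look up orders using placeholders instead of string formatting."""
    # The ? placeholders send the values separately from the SQL text,
    # so user input is never interpreted as SQL.
    cursor = conn.execute(
        "SELECT id, customer FROM orders WHERE customer = ? LIMIT ?",
        (customer_name, parse_limit(limit)),
    )
    return cursor.fetchall()
```

With this shape, an input such as `alice' OR '1'='1` is treated as an ordinary (non-matching) customer name rather than as part of the query.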
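
The points about binding to the correct host and port, logging to standard output, and shutting down on SIGTERM can be sketched together as follows. This is an illustration under stated assumptions, not a reference implementation: the standard-library HTTP server is only a placeholder for whatever framework the app actually uses and makes no attempt at HTTP/2 support, the fallback port 8000 is an arbitrary choice for local runs, and the names `resolve_port`, `install_sigterm_handler`, `HealthHandler`, and `main` are hypothetical.

```python
import logging
import os
import signal
import sys
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Send all log records to stdout instead of a local file.
logging.basicConfig(
    stream=sys.stdout,
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
log = logging.getLogger("app")


def resolve_port(default=8000):
    """Read the port from DATABRICKS_APP_PORT; the default is for local runs only."""
    return int(os.environ.get("DATABRICKS_APP_PORT", default))


class HealthHandler(BaseHTTPRequestHandler):
    """Placeholder handler; a real app would plug in its framework here."""

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        # Route access logs through the logger rather than raw stderr writes.
        log.info(fmt, *args)


def install_sigterm_handler():
    """Return an Event that is set when the process receives SIGTERM."""
    stop = threading.Event()

    def on_sigterm(signum, frame):
        log.info("SIGTERM received, starting shutdown")
        stop.set()

    signal.signal(signal.SIGTERM, on_sigterm)
    return stop


def main():
    stop = install_sigterm_handler()
    server = ThreadingHTTPServer(("0.0.0.0", resolve_port()), HealthHandler)
    worker = threading.Thread(target=server.serve_forever, daemon=True)
    worker.start()
    log.info("listening on 0.0.0.0:%d", server.server_address[1])

    stop.wait()             # Block until SIGTERM arrives.
    server.shutdown()       # Stop accepting requests; keep cleanup well under 15 seconds.
    server.server_close()
    log.info("shutdown complete")
```

Calling `main()` starts the placeholder server and blocks until the process receives SIGTERM. Keeping the cleanup path short matters more than making it thorough, since anything still running after the 15-second window is killed.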
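
For global exception handling, one possible shape is a small WSGI wrapper that logs the full traceback to stderr and returns a generic 500 response to the client. The sketch below assumes a WSGI-style app. `catch_all` and `broken_app` are hypothetical names, and most web frameworks offer their own error-handler hooks that would normally be used instead.

```python
import json
import logging
import sys

# Tracebacks go to stderr, where the platform captures them.
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
log = logging.getLogger("app.errors")


def catch_all(app):
    """Wrap a WSGI app so uncaught exceptions become generic 500 responses."""

    def wrapped(environ, start_response):
        try:
            return app(environ, start_response)
        except Exception:
            # Full details are logged server-side only.
            log.exception("unhandled error for %s", environ.get("PATH_INFO"))
            body = json.dumps({"error": "internal server error"}).encode("utf-8")
            headers = [
                ("Content-Type", "application/json"),
                ("Content-Length", str(len(body))),
            ]
            # Passing exc_info follows the WSGI convention for error responses.
            start_response("500 Internal Server Error", headers, sys.exc_info())
            return [body]

    return wrapped


def broken_app(environ, start_response):
    """Hypothetical app that fails with a message that must not leak."""
    raise RuntimeError("connection failed for secret token abc123")
```

The client receives only `{"error": "internal server error"}`, while the exception message and stack trace appear in the logs. This wrapper catches only errors raised while the app is being called; errors raised later, while a lazily generated response body is being iterated, would need separate handling.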
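
The caching and lazy-loading advice can be sketched with functools.lru_cache. The functions `load_report` and `get_lookup_table` and the data they return are hypothetical placeholders for an expensive query and a heavy resource. The cache size of 256 is an arbitrary assumption.

```python
import functools


@functools.lru_cache(maxsize=256)
def load_report(user_id, report_name):
    """Return a report for one user, computing it at most once per key.

    user_id is part of the cache key, so a result cached for one user
    is never returned to a different user.
    """
    # Placeholder for an expensive query or API call.
    return f"report {report_name!r} for {user_id!r}"


@functools.lru_cache(maxsize=1)
def get_lookup_table():
    """Build a heavy resource on first use instead of at startup."""
    # Placeholder for loading a large file or model.
    return tuple(range(1000))
```

Repeated calls with the same arguments are served from memory, and `load_report.cache_info()` reports hits and misses, which helps when checking that the cache is effective. Returning immutable values (strings, tuples) avoids one caller accidentally modifying data that later callers receive from the cache.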