Productionize your Databricks Apps agent

Once you've authored an agent and deployed it on Databricks Apps, take it to production in this order:

    • 2. Load test your Databricks Apps agent
      Find the maximum queries per second (QPS) your agent can sustain. Run a ramp-to-saturation load test against a mock-LLM build of your agent so that you measure Databricks Apps infrastructure throughput separately from model latency.
    • 3. Govern LLM usage with Unity AI Gateway
      Route LLM calls through Unity AI Gateway. Centralize permissions, attribute cost per app, swap models, and inspect or replay traffic without modifying agent code.
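
To make the ramp-to-saturation idea in the load-test step concrete, here is a minimal sketch. It is not the procedure from the linked guide: the names `StepResult`, `simulated_step`, and `ramp_to_saturation`, the thresholds, and the 120 QPS capacity are all hypothetical, and the step function simulates a service instead of sending real traffic. In a real test, the step function would drive a load generator against the mock-LLM build for a fixed interval and report what it observed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StepResult:
    offered_qps: float   # load the test tried to send
    achieved_qps: float  # successful responses per second actually observed
    error_rate: float    # fraction of requests that failed or timed out


def simulated_step(offered_qps: float, capacity_qps: float = 120.0) -> StepResult:
    """Hypothetical stand-in for one load-test step; no network traffic is sent."""
    achieved = min(offered_qps, capacity_qps)
    excess = max(0.0, offered_qps - capacity_qps)
    return StepResult(offered_qps, achieved, excess / offered_qps)


def ramp_to_saturation(
    run_step: Callable[[float], StepResult],
    start_qps: float = 10.0,
    step_qps: float = 10.0,
    max_qps: float = 1000.0,
    min_efficiency: float = 0.95,   # assumed threshold: achieved / offered
    max_error_rate: float = 0.01,   # assumed threshold: tolerated error fraction
) -> float:
    """Raise offered load step by step; return the last achieved QPS before saturation."""
    best = 0.0
    offered = start_qps
    while offered <= max_qps:
        result = run_step(offered)
        saturated = (
            result.achieved_qps < min_efficiency * offered
            or result.error_rate > max_error_rate
        )
        if saturated:
            break
        best = result.achieved_qps
        offered += step_qps
    return best


if __name__ == "__main__":
    print(ramp_to_saturation(simulated_step))
```

Because the simulated service has no model latency, the value returned reflects only the capacity of the serving layer, which is the same reason the load-test step targets a mock-LLM build.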

For generic Databricks Apps CI/CD that isn't agent-specific, see CI/CD for Databricks Apps with GitHub Actions.