Set up CI/CD for your Databricks Apps agent
A CI/CD pipeline runs every change to your agent through code review and an automated deploy, so production rollouts don't depend on any one developer's laptop. Once the pipeline is configured, every merge to your main branch deploys and restarts your agent on Databricks Apps.
This page covers the agent-specific pieces. CI/CD for Databricks Apps with GitHub Actions documents the core workflow setup: workload identity federation, the GitHub environment, and the deploy YAML. Complete that page first, then return here for the additions that apply to agent apps.
Requirements
- An agent app deployed at least once on Databricks Apps using the OpenAI Agents SDK, LangGraph, or a custom framework. See Author an AI agent and deploy it on Databricks Apps.
- A Databricks service principal with a GitHub Actions federation policy and CAN MANAGE on the app. See Step 1. Configure workload identity federation.
- The Databricks CLI installed and authenticated locally. See Install or update the Databricks CLI.
Step 1. Use the starter workflow
Several agent templates in databricks/app-templates ship a ready-to-use .github/workflows/deploy.yml, so you don't have to write the workflow from scratch.
- Pick an agent template from databricks/app-templates, such as agent-langgraph or agent-openai-agents-sdk.
- In your cloned template directory, check whether .github/workflows/deploy.yml exists.
- Set up the workflow:
  - If deploy.yml exists: Open it, confirm the databricks bundle run step references your bundle's resource key from databricks.yml (see the sketch after this list), and follow the prerequisites in the file's header comment.
  - If deploy.yml does not exist: Copy it from a template that does, or from Step 4. Add the deploy workflow. Then update the databricks bundle run <key> step to match your bundle's resource key.
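For reference, a minimal sketch of the step to check, assuming a bundle resource key of my_agent and a prod target (take the real key from the resources section of your databricks.yml):

```yaml
# Inside the deploy job's steps, after the `databricks bundle deploy` step:
- name: Start app
  run: databricks bundle run my_agent --target prod  # key must match databricks.yml
```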
Step 2. Pre-fill the MLflow experiment ID
Agent templates leave MLFLOW_EXPERIMENT_ID empty in databricks.yml. The quickstart script fills it in locally on first setup, but a fresh CI runner does not. If experiment_id is empty, databricks bundle deploy fails with a Terraform type error (For input string: "").
To fix it, commit the populated value:
- Run uv run quickstart --profile <your-profile> locally on the machine where you authored the agent.
- Verify that the experiment resource in databricks.yml (the entry with name: 'experiment' under resources.apps.<key>.resources) now has a numeric experiment_id, as in the sketch after this list.
- Commit the change.
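A hedged sketch of the populated entry; the app key my_agent and the exact nesting are illustrative and vary by template, but experiment_id is the field that must hold a numeric value before CI runs databricks bundle deploy:

```yaml
resources:
  apps:
    my_agent:                  # your app's resource key
      resources:
        - name: 'experiment'   # the entry the quickstart script updates
          experiment:
            # Empty in a fresh clone, which is what breaks CI deploys.
            experiment_id: '1234567890123456'   # illustrative numeric ID
```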
The experiment is workspace-scoped, so the same ID is valid for every CI deploy targeting that workspace. If you deploy to multiple workspaces, declare a per-target experiment in databricks.yml (one per targets.<env> block) or use a bundle variable.
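A minimal sketch of the bundle-variable approach, assuming a variable named experiment_id with per-target overrides (the IDs shown are illustrative):

```yaml
variables:
  experiment_id:
    description: MLflow experiment ID for the target workspace

targets:
  staging:
    variables:
      experiment_id: '1111111111111111'   # experiment in the staging workspace
  prod:
    variables:
      experiment_id: '2222222222222222'   # experiment in the prod workspace
```

Reference the variable from the experiment resource as ${var.experiment_id}.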
Grant Postgres permissions for Lakebase memory templates
The advanced agent templates (agent-langgraph-advanced, agent-openai-advanced) declare an autoscaling Lakebase Postgres resource directly in databricks.yml. With Databricks CLI v0.295.0 and later, databricks bundle deploy provisions the resource alongside the app.
The DAB postgres resource grants the app's Databricks service principal workspace-level access to the Lakebase project, but Lakebase keeps a separate Postgres-role layer for database access (schemas, tables, and sequences). The Databricks service principal needs a Postgres role with the right privileges before the agent can read or write its memory tables. See Authentication architecture for the two-layer model.
Granting these Postgres-level privileges is a one-time setup: run it locally between the first bundle deploy and bundle run. CI redeploys after that follow the standard deploy-then-run path, because the Databricks service principal's Postgres role persists for the lifetime of the app.
- Deploy the bundle to provision the Lakebase resource:

  ```bash
  databricks bundle deploy --target prod
  ```

- Grant the Databricks service principal the Postgres-level privileges it needs:

  ```bash
  uv run python scripts/grant_lakebase_permissions.py \
    "$(databricks apps get <app-name> --output json | jq -r '.service_principal_client_id')" \
    --memory-type openai \
    --autoscaling-endpoint <endpoint>
  ```

  For the LangGraph template, pass --memory-type langgraph. The script also accepts --project <project> --branch <branch> for autoscaling Lakebase, or --instance-name <name> for provisioned Lakebase.

- Start the app:

  ```bash
  databricks bundle run <bundle-key> --target prod
  ```
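After the one-time grant, no Lakebase-specific step is needed in CI. A minimal sketch of the standard path as workflow steps, assuming a prod target (step names are illustrative):

```yaml
- name: Deploy bundle
  run: databricks bundle deploy --target prod

- name: Start app
  run: databricks bundle run <bundle-key> --target prod
```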
Step 3. Smoke test the deployed agent
databricks bundle run returns as soon as the runner signals the agent to start, but the agent process may still fail during boot. After the health check from Step 5. Wait for the app to be healthy, add the following smoke-test step to deploy.yml, which posts a canary request to /invocations:
```yaml
- name: Smoke test invocations
  env:
    APP_NAME: my-agent
  run: |
    APP_URL=$(databricks apps get "$APP_NAME" --output json | jq -r '.url')
    TOKEN=$(databricks auth token | jq -r '.access_token')
    STATUS=$(curl -sS -o /tmp/canary.json -w "%{http_code}" \
      -X POST "$APP_URL/invocations" \
      -H "Authorization: Bearer $TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"input": [{"role": "user", "content": "ping"}], "stream": false}')
    if [ "$STATUS" != "200" ]; then
      echo "Smoke test failed with status $STATUS:" >&2
      cat /tmp/canary.json >&2
      exit 1
    fi
    echo "Smoke test passed."
```
Databricks Apps accept only OAuth tokens for invocation; any other token type is rejected. Use the workspace OAuth token from databricks auth token, as the step above does.