Set up CI/CD for your Databricks Apps agent

A CI/CD pipeline runs every change to your agent through code review and an automated deploy, so production rollouts don't depend on any one developer's laptop. Once the pipeline is configured, every merge to your main branch deploys and restarts your agent on Databricks Apps.

This page covers the agent-specific pieces. CI/CD for Databricks Apps with GitHub Actions documents the core workflow setup: workload identity federation, the GitHub environment, and the deploy YAML. Complete that page first, then return here for the additions that apply to agent apps.

Requirements

Step 1. Use the starter workflow

Several agent templates in databricks/app-templates ship a ready-to-use .github/workflows/deploy.yml, so you don't have to write the workflow from scratch.

  1. Pick an agent template from databricks/app-templates, such as agent-langgraph or agent-openai-agents-sdk.
  2. In your cloned template directory, check whether .github/workflows/deploy.yml exists.
  3. Set up the workflow:
    • If deploy.yml exists: Open it, confirm the databricks bundle run step references your bundle's resource key from databricks.yml, and follow the prerequisites in the file's header comment.
    • If deploy.yml does not exist: Copy it from a template that does, or from Step 4. Add the deploy workflow. Then update the databricks bundle run <key> step to match your bundle's resource key.
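
If you end up writing the workflow yourself, the core of the deploy job is just two CLI calls. The following sketch assumes the bundle's resource key in databricks.yml is my_agent (a placeholder) and that authentication is already configured per the core CI/CD page:

```yaml
# Sketch of the deploy job's core steps; my_agent is a placeholder
# for your bundle's resource key from databricks.yml.
- name: Deploy bundle
  run: databricks bundle deploy --target prod

- name: Restart app
  run: databricks bundle run my_agent --target prod
```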

Step 2. Pre-fill the MLflow experiment ID

Agent templates leave MLFLOW_EXPERIMENT_ID empty in databricks.yml. The quickstart script fills it in locally on first setup, but a fresh CI runner does not. If experiment_id is empty, databricks bundle deploy fails with a Terraform type error (For input string: "").

To fix it, commit the populated value:

  1. Run uv run quickstart --profile <your-profile> locally on the machine where you authored the agent.
  2. Verify that the experiment resource in databricks.yml (the entry with name: 'experiment' under resources.apps.<key>.resources) now has a numeric experiment_id.
  3. Commit the change.
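
After the quickstart runs, the committed entry should look roughly like this sketch. The resource key and ID are illustrative, and the exact nesting may differ between templates; the point is that experiment_id holds a numeric value rather than an empty string:

```yaml
resources:
  apps:
    my_agent:                    # placeholder bundle resource key
      resources:
        - name: experiment
          # populated by the quickstart script; an empty string here
          # makes `databricks bundle deploy` fail in CI
          experiment_id: '1234567890123456'
```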

The experiment is workspace-scoped, so the same ID is valid for every CI deploy targeting that workspace. If you deploy to multiple workspaces, declare a per-target experiment in databricks.yml (one per targets.<env> block) or use a bundle variable.
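
For the multi-workspace case, one option is a bundle variable overridden per target. A minimal sketch, with placeholder target names and IDs:

```yaml
# Sketch: per-target experiment IDs via a bundle variable.
# Reference it as ${var.experiment_id} in the experiment resource.
variables:
  experiment_id:
    description: MLflow experiment ID for the target workspace

targets:
  dev:
    variables:
      experiment_id: '1111111111111111'
  prod:
    variables:
      experiment_id: '2222222222222222'
```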

Grant Postgres permissions for Lakebase memory templates

The advanced agent templates (agent-langgraph-advanced, agent-openai-advanced) declare an autoscaling Lakebase Postgres resource directly in databricks.yml. With Databricks CLI v0.295.0 and later, databricks bundle deploy provisions the resource alongside the app.

The DAB postgres resource grants the app's Databricks service principal workspace-level access to the Lakebase project, but Lakebase keeps a separate Postgres-role layer for database access (schemas, tables, and sequences). The Databricks service principal needs a Postgres role with the right privileges before the agent can read or write its memory tables. See Authentication architecture for the two-layer model.

Granting these Postgres-level privileges is a one-time setup. Run it locally between the first bundle deploy and bundle run. After that, CI redeploys flow through the standard deploy-then-run path, because the Databricks service principal's Postgres role persists for the lifetime of the app.

  1. Deploy the bundle to provision the Lakebase resource:

    ```bash
    databricks bundle deploy --target prod
    ```
  2. Grant the Databricks service principal the Postgres-level privileges it needs:

    ```bash
    uv run python scripts/grant_lakebase_permissions.py \
      "$(databricks apps get <app-name> --output json | jq -r '.service_principal_client_id')" \
      --memory-type openai \
      --autoscaling-endpoint <endpoint>
    ```

    For the LangGraph template, pass --memory-type langgraph. The script also accepts --project <project> --branch <branch> for autoscaling Lakebase, or --instance-name <name> for provisioned Lakebase.

  3. Start the app:

    ```bash
    databricks bundle run <bundle-key> --target prod
    ```

Step 3. Smoke test the deployed agent

databricks bundle run returns as soon as the runner signals the agent to start, but the agent process can still fail during boot. After the health check from Step 5. Wait for the app to be healthy, add the following smoke-test step to deploy.yml; it posts a canary request to /invocations:

```yaml
- name: Smoke test invocations
  env:
    APP_NAME: my-agent
  run: |
    APP_URL=$(databricks apps get "$APP_NAME" --output json | jq -r '.url')
    TOKEN=$(databricks auth token | jq -r '.access_token')
    STATUS=$(curl -sS -o /tmp/canary.json -w "%{http_code}" \
      -X POST "$APP_URL/invocations" \
      -H "Authorization: Bearer $TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"input": [{"role": "user", "content": "ping"}], "stream": false}')
    if [ "$STATUS" != "200" ]; then
      echo "Smoke test failed with status $STATUS:" >&2
      cat /tmp/canary.json >&2
      exit 1
    fi
    echo "Smoke test passed."
```
note

Databricks Apps accept only OAuth tokens for invocation. Use the workspace OAuth token from databricks auth token; any other token type is rejected.
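
To reproduce the canary request locally, a sketch like the following works, assuming the Databricks CLI is authenticated and jq is installed. The app name my-agent is a placeholder, and the network calls are guarded behind RUN_CANARY=1 so the payload helper stands alone:

```shell
#!/usr/bin/env bash
# Local canary request against a deployed agent app (sketch).
# "my-agent" is a placeholder app name.
set -euo pipefail

# Build the same request body the CI smoke test sends to /invocations.
payload() {
  printf '{"input": [{"role": "user", "content": "%s"}], "stream": false}' "$1"
}

if [ "${RUN_CANARY:-0}" = "1" ]; then
  APP_URL=$(databricks apps get my-agent --output json | jq -r '.url')
  TOKEN=$(databricks auth token | jq -r '.access_token')  # OAuth token only
  curl -sS -X POST "$APP_URL/invocations" \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d "$(payload ping)"
else
  # Without a workspace, just show the request body that would be sent.
  payload ping
fi
```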

Next steps