
Deploying Releases

We use a two-stage release process to deploy infrastructure changes safely across all environments. Stage 1 creates a pre-release for review. Stage 2 promotes it through every environment.

This workflow requires membership in the dev-infra-admins team.

Stage 1: Create a Pre-Release

No infrastructure changes are made at this point — the pre-release exists for review only.

Option A: CLI

./zig/zig build release

This verifies the GitHub CLI is authenticated, triggers the “Deploy New Release” workflow, and creates a pre-release with date-based versioning (YYYY.MM.DD).
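As a rough sketch of the steps described above, the helpers below check GitHub CLI authentication, trigger the workflow by name, and show the date-based version format. The function names are hypothetical, using UTC for the date is an assumption, and the actual build step may work differently.

```shell
# Hypothetical helpers; not the actual implementation behind the build step.

# Print today's date as YYYY.MM.DD (using UTC is an assumption).
release_version() {
  date -u +%Y.%m.%d
}

# Verify that gh is authenticated, then trigger the workflow by its display name.
trigger_release() {
  gh auth status >/dev/null 2>&1 || {
    echo "GitHub CLI is not authenticated; run 'gh auth login' first" >&2
    return 1
  }
  gh workflow run "Deploy New Release"
}
```

Calling `trigger_release` from inside a clone of the repository queues the workflow run; the pre-release itself is then created by the workflow.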

Option B: GitHub UI

  1. Go to the repository’s Actions tab
  2. Select the “Deploy New Release” workflow
  3. Click “Run workflow”, then click the green “Run workflow” button in the dropdown to confirm

What Happens

A pre-release is created with auto-generated release notes. Cloud Build links are appended to the notes so you can review the planned changes before promoting.
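To collect the Cloud Build links from the release notes in one place, a small filter like the one below could help. It is only a sketch: the function name is hypothetical, and the URL prefix is an assumption about how the links appear in the notes.

```shell
# Read release notes on stdin and print each Cloud Build console URL once.
# The URL prefix below is an assumption about the link format in the notes.
cloud_build_links() {
  grep -oE 'https://console\.cloud\.google\.com/cloud-build/[^[:space:])]+' | sort -u
}
```

The notes can be piped in from the GitHub CLI, for example `gh release view <tag> --json body --jq .body | cloud_build_links`, where `<tag>` is the pre-release version.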

Stage 2: Promote to Full Release

Once you’ve reviewed the pre-release and its Cloud Build links, convert it to a full release to trigger deployment.

  1. Navigate to the pre-release on GitHub
  2. Click Edit (pencil icon)
  3. Uncheck “Set as a pre-release”
  4. Click “Update release”

This triggers promotion through all environments in sequence:

plan → development → non-production → production → shared

Each promotion uses fast-forward merges only. If any branch has diverged, the workflow stops and requires manual intervention. Cloud Build runs terraform apply on each environment as its branch is updated.

Recovery: Non-Admin Triggered a Release

If a non-admin accidentally converts a pre-release to a full release, the workflow fails at the permission check — no promotion occurs.

A dev-infra-admins member can recover by toggling the release:

  1. Edit the release → check “Set as a pre-release” → save
  2. Edit again → uncheck “Set as a pre-release” → save

This re-fires the release.released event with the admin as the actor.
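The same toggle can be sketched with the GitHub CLI. This assumes that editing the release through the CLI fires the same event as editing it in the UI, which is worth confirming before relying on it; the function name is hypothetical.

```shell
# Hypothetical helper: flip a release back to pre-release, then to full release.
# Run it as a dev-infra-admins member so that member becomes the event actor.
retrigger_release() {
  tag="$1"
  [ -n "$tag" ] || { echo "usage: retrigger_release <tag>" >&2; return 2; }
  gh release edit "$tag" --prerelease || return 1
  gh release edit "$tag" --prerelease=false
}
```

Usage: `retrigger_release <tag>`, where `<tag>` is the release version shown on the release page.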

Branch Promotion Model

Each branch maps to an environment. The release workflow advances changes through them sequentially using git merge --ff-only — no merge commits, strictly linear history.

Branch          Environment     Purpose
plan            (none)          Integration branch. All PRs merge here first.
development     Development     Initial deployment testing
non-production  Non-Production  Pre-production validation
production      Production      Live infrastructure
shared          Shared          Cross-environment resources

Each promotion must succeed before the next begins. If plan → development fails, everything downstream remains unchanged.
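The sequential, stop-on-failure behavior can be illustrated with a short loop. This is a minimal local sketch, not the release workflow itself: it assumes a clone where all five branches exist, and it leaves out pushing.

```shell
# Illustrative only: fast-forward each branch to the one before it in the chain.
# Stops at the first branch that cannot be fast-forwarded; later branches stay untouched.
promote_chain() {
  prev=plan
  for branch in development non-production production shared; do
    git checkout --quiet "$branch" || return 1
    if ! git merge --ff-only --quiet "$prev"; then
      echo "Stopped: $branch cannot fast-forward to $prev" >&2
      return 1
    fi
    prev="$branch"
  done
}
```

Because `git merge --ff-only` refuses to create a merge commit, a successful run leaves every branch pointing at the same commit as plan.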

Handling Diverged Branches

If someone pushed directly to a downstream branch, the fast-forward merge cannot proceed. You will see:

Cannot proceed: development branch has 2 commit(s) that are not in plan
Process stopped. Please merge or rebase development changes first.

To resolve this, cherry-pick the unexpected commits back to plan and reset the downstream branch:

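# Refresh remote-tracking refs and identify the unexpected commits
git fetch origin
git log origin/plan..origin/development --oneline

# Cherry-pick them onto plan (oldest first) and publish plan.
# If plan is protected, open a PR with these commits instead of pushing.
git checkout plan
git pull --ff-only origin plan
git cherry-pick <commit-sha>
git push origin plan

# Reset the downstream branch to match plan
git checkout development
git reset --hard plan
git push origin development --force-with-lease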

Then re-run the release promotion.
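Before re-running, it can be useful to confirm that no other branch in the chain has diverged. The check below is a sketch with a hypothetical function name; it compares each branch with the one before it and reports any commits the upstream branch lacks. It reads `origin/` refs by default, so run `git fetch origin` first.

```shell
# Illustrative pre-flight check: report every downstream branch holding commits
# that are missing from the branch before it. The optional argument is a ref
# prefix (default "origin/"); pass "" to compare local branches instead.
check_chain() {
  prefix="${1-origin/}"
  status=0
  prev=plan
  for branch in development non-production production shared; do
    extra=$(git rev-list --count "${prefix}${prev}..${prefix}${branch}") || return 2
    if [ "$extra" -gt 0 ]; then
      echo "$branch has $extra commit(s) that are not in $prev"
      status=1
    fi
    prev="$branch"
  done
  return "$status"
}
```

A return status of 0 means every promotion in the chain can fast-forward.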

Monitoring Deployments

After promoting a release, Cloud Build runs terraform apply for each environment.

Accessing Cloud Build

Open the Cloud Build History for the infra pipeline project.

Filter by trigger name to find builds for a specific environment:

trigger_name:"non-production"

Expected Build Sequence

Builds appear in this order, with a short delay between each:

  1. development — triggered when plan merges to development
  2. non-production — triggered when development merges to non-production
  3. production — triggered when non-production merges to production
  4. shared — triggered when production merges to shared

If a branch was already in sync, its build does not trigger — there is nothing to apply.

Viewing Terraform Output

Click a build to open its details. Expand the Run terraform apply step. The log shows resources created, modified, or destroyed, plus the final summary:

Apply complete! Resources: 3 added, 1 changed, 0 destroyed.
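If you save a build log locally, the summary counts can be pulled out with a small filter. This is a sketch with a hypothetical function name, and it assumes the summary line has the format shown above.

```shell
# Read a build log on stdin and print "added changed destroyed" from the last
# "Apply complete!" line. Returns 1 if the log contains no such line.
apply_summary() {
  line=$(grep -E 'Apply complete! Resources: [0-9]+ added, [0-9]+ changed, [0-9]+ destroyed' | tail -n 1)
  [ -n "$line" ] || return 1
  echo "$line" | sed -E 's/.*Resources: ([0-9]+) added, ([0-9]+) changed, ([0-9]+) destroyed.*/\1 \2 \3/'
}
```

A non-zero third number is a quick signal that the apply destroyed resources and deserves a closer look.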

Troubleshooting Failed Builds

If a build fails:

  1. Check the build logs for the Terraform error
  2. Common causes:
    • Resource conflicts or dependency ordering
    • Permission errors on GCP resources
    • State lock contention (another apply is in progress)
  3. Fix the issue in a new PR to plan and create a new release