PR Preview Environments

Status: Published | Author: DevOps Team | Created: 2026-02-25 | Updated: 2026-02-25 | Jira: HB-7963


Table of Contents

  1. Executive Summary
  2. Features
  3. Architecture Overview
  4. Resource Inventory
  5. Service Account & IAM Design
  6. Secrets & Credentials
  7. Deployment Lifecycle
  8. Cleanup Lifecycle
  9. Image Build Pipeline
  10. Application-Specific Integrations
  11. Naming Conventions & Parameterization
  12. Cost & Resource Constraints
  13. Known Limitations & Future Work
  14. Implementation Guide: Adding PR Previews to a New Application
  15. Appendix: Source File Reference

1. Executive Summary

PR Preview Environments provide ephemeral, per-pull-request deployments of web applications running on GCP Cloud Run. When a developer opens a pull request against main in the Consumer Portal 3.0 (CoPo 3.0) application repository, the system automatically builds fresh Docker images via Cloud Build, provisions an isolated PostgreSQL database, deploys a full three-service application stack as Cloud Run services in the non-production GCP project, and posts a live access URL and QR code directly on the pull request. The environment stays live for as long as the PR is open, receiving updates on every new push, and is fully torn down automatically when the PR is closed or merged.

The infrastructure is deliberately split across two repositories to separate concerns. The infrahive repository manages all foundational GCP resources via Terraform: service accounts, IAM role bindings, Secret Manager secrets, and cross-project permissions. The application repository (the_consumer_portal) owns the operational layer: GitHub Actions workflows that orchestrate the deploy and cleanup lifecycle, and a Cloud Build configuration for building Docker images. This split means the shared infrastructure is provisioned once and reused by every PR, while the application team retains full control over their deployment logic.

Each PR preview deploys a Cloud Run service for each container your application requires. The typical pattern is one service per Dockerfile – for example, a frontend, a backend API, and a reverse proxy (Caddy) if your application uses one. A simpler single-container app would deploy just one Cloud Run service. CoPo 3.0, as the reference implementation, deploys three services: a React frontend, a Spring Boot API, and a Caddy proxy router that serves as the single public entry point, routing /api/* requests to the API and everything else to the frontend. Each PR preview also gets an isolated PostgreSQL database on a shared Cloud SQL instance. All services and the database are created fresh on each PR push and destroyed completely when the PR closes, leaving no persistent state between previews.


2. Features

  • Automatic deployment on PR open, synchronize (new push), and reopen events targeting main
  • Automatic teardown on PR close or merge, removing all Cloud Run services and the PR database
  • Isolated database per PR with a fresh schema created from scratch on every push, eliminating Flyway migration checksum conflicts during active development
  • OAuth callback registration (optional, app-specific): if your application uses an OAuth provider, preview URLs can be dynamically registered as allowed callbacks on deploy and deregistered on cleanup
  • PR comment with access URL posted after deployment, including the proxy URL, a QR code for mobile testing, the commit SHA, and a table of individual service URLs
  • Zero-to-two scaling on all services: instances scale to zero when idle (eliminating cost for unused previews) and scale up to a maximum of two instances under load
  • Commit-SHA tagged images: every push builds fresh Docker images tagged with the exact commit SHA, providing full traceability from a running service back to the source code
  • Parallel image builds: all three Docker images (API, frontend, proxy) are built simultaneously on a high-CPU Cloud Build machine, minimizing total build time
  • Resilient cleanup: all teardown steps use continue-on-error: true so a failure to delete one resource does not block deletion of the remaining resources

3. Architecture Overview

3.1 Full Lifecycle Flow

The following diagram shows the end-to-end lifecycle from a developer pushing code through to deployment and eventual cleanup.

flowchart TD
    Dev[Developer opens or pushes to PR]

    subgraph "the-helper-bees project"
        CB[Cloud Build trigger fires]
        Kaniko[Kaniko builds 3 images in parallel]
        GCR["Images pushed to GCR\ntagged with commit SHA"]
        CB --> Kaniko --> GCR
    end

    subgraph "GitHub Actions - deploy-preview.yml"
        Poll["Poll GitHub Check Runs API\nfor 'the-consumer-portal-docker-images'\nup to 60 tries × 30s"]
        DBSetup["Drop + recreate PR database\ncopo3_pr_${PR_NUM}"]
        DeployAPI["gcloud run deploy\ncopo-pr-${PR_NUM}-api"]
        DeployFE["gcloud run deploy\ncopo-pr-${PR_NUM}-frontend"]
        DeployProxy["gcloud run deploy\ncopo-pr-${PR_NUM}-proxy"]
        Comment["Post PR comment\nURL + QR code + service table"]
    end

    subgraph "prj-bu1-n-pd-infra-fee5 project"
        CloudRun["3 Cloud Run services\napi | frontend | proxy"]
        CloudSQL["Cloud SQL instance\ncopo3-n-psql-050f7fc3\nDatabase: copo3_pr_${PR_NUM}"]
    end

    subgraph "GitHub Actions - cleanup-preview.yml"
        GetURL["Get proxy URL before deletion"]
        DropDB["Drop PR database"]
        DeleteSvcs["Delete api + frontend + proxy\nCloud Run services"]
        CleanComment["Post cleanup confirmation comment"]
    end

    Dev --> CB
    GCR --> Poll
    Poll --> DBSetup
    DBSetup --> DeployAPI --> DeployFE --> DeployProxy
    DeployProxy --> Comment
    DeployAPI --> CloudRun
    DeployFE --> CloudRun
    DeployProxy --> CloudRun
    DeployAPI --> CloudSQL
    Comment -->|PR merged or closed| GetURL
    GetURL --> DropDB --> DeleteSvcs --> CleanComment

3.2 Cross-Project Resource Diagram

GCP resources for PR preview environments span two projects. Images live in the-helper-bees (the shared build project), while runtime resources live in prj-bu1-n-pd-infra-fee5 (the non-production application project).

graph LR
    subgraph "the-helper-bees"
        GCR_API["gcr.io/the-helper-bees/\nthe_consumer_portal/api:SHA"]
        GCR_FE["gcr.io/the-helper-bees/\nthe_consumer_portal/frontend:SHA"]
        GCR_Proxy["gcr.io/the-helper-bees/\nthe_consumer_portal/proxy_router:SHA"]
    end

    subgraph "prj-bu1-n-pd-infra-fee5"
        RunAPI["Cloud Run\ncopo-pr-N-api"]
        RunFE["Cloud Run\ncopo-pr-N-frontend"]
        RunProxy["Cloud Run\ncopo-pr-N-proxy"]
        SQL["Cloud SQL\ncopo3-n-psql-050f7fc3\nDB: copo3_pr_N"]
        DeployerSA["pd-pr-preview-deployer-sa\n(CI/CD identity)"]
        RuntimeSA["pd-pr-preview-runtime-sa\n(Cloud Run identity)"]
        SecretDeployer["Secret Manager\npd-pr-preview-deployer-sa-key"]
        SecretRuntime["Secret Manager\npd-pr-preview-runtime-sa-key"]
        ServiceAgent["Cloud Run Service Agent\nservice-NUMBER@serverless-robot-prod"]
    end

    GCR_API -->|pull on deploy| RunAPI
    GCR_FE -->|pull on deploy| RunFE
    GCR_Proxy -->|pull on deploy| RunProxy

    ServiceAgent -->|roles/artifactregistry.reader| GCR_API
    ServiceAgent -->|roles/artifactregistry.reader| GCR_FE
    ServiceAgent -->|roles/artifactregistry.reader| GCR_Proxy

    DeployerSA -->|roles/artifactregistry.reader cross-project| GCR_API

    RunAPI -->|Cloud SQL Auth Proxy socket| SQL
    RuntimeSA -->|roles/cloudsql.client| SQL
    RunAPI --- RuntimeSA
    RunFE --- RuntimeSA
    RunProxy --- RuntimeSA

    SecretDeployer -->|JSON key| DeployerSA
    SecretRuntime -->|JSON key| RuntimeSA

3.3 Three-Service Routing Architecture

Note: This routing architecture is specific to CoPo 3.0. Your application will look different depending on how it is architected – a single-service app may not need a proxy at all, while a microservices app may have additional backend services behind the proxy.

All external traffic enters through the proxy service on port 30000. The proxy uses Caddy rules to route /api/* requests to the API service and all other requests to the frontend service.

graph TD
    User[Browser / QR Code]
    Proxy["copo-pr-N-proxy\n(Caddy, port 30000)\nPublic URL"]
    Frontend["copo-pr-N-frontend\n(React + Vite + Caddy, port 80)\nInternal URL"]
    API["copo-pr-N-api\n(Spring Boot, port 8080)\nInternal URL"]
    DB["Cloud SQL PostgreSQL\ncopo3_pr_N"]

    User --> Proxy
    Proxy -->|"/api/* routes"| API
    Proxy -->|"all other routes"| Frontend
    API -->|"Cloud SQL Auth Proxy\nUnix socket"| DB
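For illustration only, the proxy's path-matching rule can be mimicked with a shell case statement. This is a local sketch of the routing decision, not the actual Caddy configuration:

```shell
#!/bin/sh
# Sketch of the proxy's routing decision: /api/* goes to the API,
# everything else goes to the frontend.
route() {
  case "$1" in
    /api/*) echo "api" ;;
    *)      echo "frontend" ;;
  esac
}

route "/api/v1/users"   # prints "api"
route "/dashboard"      # prints "frontend"
route "/"               # prints "frontend"
```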

4. Resource Inventory

The table below catalogs every GCP resource involved in the PR preview system, including which project it lives in, its naming pattern, and how long it persists.

| GCP Service | Resource | Project | Naming Pattern | Lifecycle |
| --- | --- | --- | --- | --- |
| Cloud Run | API service | prj-bu1-n-pd-infra-fee5 | copo-pr-{PR_NUM}-api | Per PR (deleted on close) |
| Cloud Run | Frontend service | prj-bu1-n-pd-infra-fee5 | copo-pr-{PR_NUM}-frontend | Per PR (deleted on close) |
| Cloud Run | Proxy service | prj-bu1-n-pd-infra-fee5 | copo-pr-{PR_NUM}-proxy | Per PR (deleted on close) |
| Cloud SQL | PR database | prj-bu1-n-pd-infra-fee5 | copo3_pr_{PR_NUM} | Per PR (dropped on close, recreated on push) |
| Cloud SQL | Shared instance | prj-bu1-n-pd-infra-fee5 | copo3-n-psql-050f7fc3 | Permanent (pre-existing) |
| Cloud SQL | Temp admin user | prj-bu1-n-pd-infra-fee5 | pr_preview_tmp_admin | Per push (created and deleted during DB setup) |
| Container Registry | API image | the-helper-bees | gcr.io/the-helper-bees/the_consumer_portal/api:{SHA} | Per commit (accumulates indefinitely) |
| Container Registry | Frontend image | the-helper-bees | gcr.io/the-helper-bees/the_consumer_portal/frontend:{SHA} | Per commit (accumulates indefinitely) |
| Container Registry | Proxy image | the-helper-bees | gcr.io/the-helper-bees/the_consumer_portal/proxy_router:{SHA} | Per commit (accumulates indefinitely) |
| Secret Manager | Deployer SA key | prj-bu1-n-pd-infra-fee5 | pd-pr-preview-deployer-sa-key | Permanent |
| Secret Manager | Runtime SA key | prj-bu1-n-pd-infra-fee5 | pd-pr-preview-runtime-sa-key | Permanent |
| IAM | Deployer service account | prj-bu1-n-pd-infra-fee5 | pd-pr-preview-deployer-sa | Permanent |
| IAM | Runtime service account | prj-bu1-n-pd-infra-fee5 | pd-pr-preview-runtime-sa | Permanent |
| IAM | Cloud Run Service Agent | prj-bu1-n-pd-infra-fee5 | service-{PROJECT_NUMBER}@serverless-robot-prod.iam.gserviceaccount.com | Permanent (GCP-managed) |
| Cloud Build | Build trigger | the-helper-bees | the-consumer-portal-docker-images (check name) | Permanent (externally configured) |

5. Service Account & IAM Design

The system uses two custom service accounts: a deployer SA for CI/CD operations and a runtime SA for the running Cloud Run containers. This split follows the principle of least privilege: the deployer SA has elevated permissions needed to provision infrastructure, but the containers themselves run with only the permissions they need at runtime.

5.1 Deployer Service Account

Email: pd-pr-preview-deployer-sa@prj-bu1-n-pd-infra-fee5.iam.gserviceaccount.com
Purpose: Used by GitHub Actions to create and manage Cloud Run services, databases, and users during deployment and cleanup.

IAM bindings on prj-bu1-n-pd-infra-fee5:

| Role | Purpose |
| --- | --- |
| roles/run.admin | Full control to create, update, and delete Cloud Run services |
| roles/iam.serviceAccountUser | Allows the deployer SA to act as the runtime SA when creating Cloud Run services (required to attach the runtime SA as the Cloud Run service identity) |
| roles/cloudsql.admin | Create and drop PR databases, create and delete temporary database users, grant schema privileges |

Cross-project IAM bindings on the-helper-bees:

| Role | Purpose |
| --- | --- |
| roles/artifactregistry.reader | Allows the deployer SA to verify that images exist in GCR before initiating a Cloud Run deployment |

5.2 Runtime Service Account

Email: pd-pr-preview-runtime-sa@prj-bu1-n-pd-infra-fee5.iam.gserviceaccount.com
Purpose: Attached to all three Cloud Run services as their workload identity. Grants only what the running containers need.

IAM bindings on prj-bu1-n-pd-infra-fee5:

| Role | Purpose |
| --- | --- |
| roles/cloudsql.client | Allows the Cloud SQL Auth Proxy sidecar (running inside the Cloud Run service) to establish authenticated connections to the Cloud SQL instance |

5.3 Cloud Run Service Agent

GCP automatically provisions a Cloud Run Service Agent (service-{PROJECT_NUMBER}@serverless-robot-prod.iam.gserviceaccount.com) for every project. This agent performs the actual container image pull during deployment. Because images live in a different project (the-helper-bees), the Service Agent needs explicit cross-project permissions.

Cross-project IAM bindings on the-helper-bees:

| Role | Resource | Purpose |
| --- | --- | --- |
| roles/artifactregistry.reader | Project the-helper-bees | Pull images from Artifact Registry / GCR during Cloud Run deployment |
| roles/storage.objectViewer | Bucket artifacts.the-helper-bees.appspot.com | Pull images from the legacy GCR storage bucket (required for gcr.io/ image paths) |

5.4 Deployer vs. Runtime Split Rationale

The key design decision is that the deployer SA’s JSON key is stored as a GitHub Actions secret and is therefore exposed (in a limited, audited way) to the CI/CD pipeline. If the deployer SA also had roles/cloudsql.client, a compromised CI system could directly query production databases. By keeping the runtime SA separate and only granting it roles/cloudsql.client, the blast radius of a compromised deployer key is limited to infrastructure management operations.

Conversely, the runtime SA never needs Cloud Run admin permissions. A compromised container should not be able to redeploy itself or delete other services.

5.5 Terraform Source

All service accounts and IAM bindings above are managed in:

infra/pd-infra/business_unit_1/non-production/pr_preview_deployer.tf

6. Secrets & Credentials

6.1 GitHub Actions Secrets

These secrets are stored in the the_consumer_portal repository’s GitHub Actions secret store and must be configured before the workflows can run.

| Secret Name | Used In | Purpose |
| --- | --- | --- |
| GCP_PR_PREVIEW_DEPLOYER_SA_KEY | deploy-preview.yml, cleanup-preview.yml | JSON key for the deployer service account. Used with google-github-actions/auth to authenticate all GCP API calls in the workflow. |
| AUTH0_MANAGEMENT_CLIENT_SECRET | deploy-preview.yml, cleanup-preview.yml | CoPo 3.0-specific; not required by the generic architecture. OAuth client secret for obtaining a Management API token from Auth0. Used to register and deregister callback URLs. |
| AUTH0_APP_CLIENT_ID | deploy-preview.yml, cleanup-preview.yml | CoPo 3.0-specific; not required by the generic architecture. The Auth0 client ID of the CoPo 3.0 application whose callback URLs are modified. |

6.2 GCP Secret Manager Secrets

These secrets are created as empty shells by Terraform in pr_preview_deployer.tf and populated by the scripts/create-pr-preview-sa-keys.sh helper script after Terraform apply.

| Secret Name | Project | Purpose | How Populated |
| --- | --- | --- | --- |
| pd-pr-preview-deployer-sa-key | prj-bu1-n-pd-infra-fee5 | JSON key for the deployer SA. The content of this secret is also what gets uploaded to GitHub Actions as GCP_PR_PREVIEW_DEPLOYER_SA_KEY. | create-pr-preview-sa-keys.sh |
| pd-pr-preview-runtime-sa-key | prj-bu1-n-pd-infra-fee5 | JSON key for the runtime SA. Currently stored for reference but not directly used by the GitHub Actions workflows (the runtime SA is referenced by email, not by key). | create-pr-preview-sa-keys.sh |

6.3 Hardcoded Values in Workflows

The following values are hardcoded directly in the GitHub Actions workflow files rather than stored as secrets or variables. They are not sensitive credentials but should be moved to GitHub Actions variables for maintainability.

| Value | Location | Notes |
| --- | --- | --- |
| Auth0 Management API client ID (hkvjiaLpTy89kwuBOw3uVjZDWzNlnsoY) | Both workflows | CoPo 3.0-specific. Client ID for obtaining the Management API token. Not the same as AUTH0_APP_CLIENT_ID. Not a secret by itself (requires AUTH0_MANAGEMENT_CLIENT_SECRET to be useful) but hardcoded for convenience. |
| Auth0 domain (copo-3-dev.us.auth0.com) | Both workflows | CoPo 3.0-specific. The Auth0 tenant domain. Should become a GitHub Actions variable. |
| Stripe test publishable key (pk_test_...) | build/cloudbuild.yml | A public test key (safe to expose) used as a frontend build argument. Listed here for completeness. |

6.4 Application Secrets Referenced at Runtime

The API Cloud Run service references these secrets from Secret Manager at startup via the --set-secrets flag. These are pre-existing application secrets, not created by the PR preview system.

| Secret Reference | Secret Manager Path | Purpose |
| --- | --- | --- |
| VAULT_KEY_SECRET | projects/658657680596/secrets/hbcp-copo3-vault-key | Vault encryption key for the CoPo 3.0 application |
| VAULT_SECRET | projects/945498299260/secrets/hbcp-copo3-vault | Vault secrets store for the CoPo 3.0 application |

6.5 Rotation Considerations

The deployer SA’s JSON key (pd-pr-preview-deployer-sa-key) is the highest-risk credential in this system because it grants Cloud Run and Cloud SQL admin-level access. Rotation requires:

  1. Run create-pr-preview-sa-keys.sh to generate a new key and store it in Secret Manager
  2. Update the GCP_PR_PREVIEW_DEPLOYER_SA_KEY GitHub Actions secret with the new key content
  3. Delete the old key version from Secret Manager and from the GCP IAM key list

The system has no automated rotation. This is a known limitation; see Section 13 for the Workload Identity Federation upgrade path that would eliminate key management entirely.


7. Deployment Lifecycle

The deploy-preview.yml workflow is the core of the system. It runs on pull_request events with types [opened, synchronize, reopened] targeting main.

7.1 Environment Variables

The following environment variables are set at the workflow level and are available to all steps:

env:
  PROJECT_ID: prj-bu1-n-pd-infra-fee5
  REGION: us-central1
  IMAGE_REPO: gcr.io/the-helper-bees/the_consumer_portal
  RUNTIME_SA: pd-pr-preview-runtime-sa@prj-bu1-n-pd-infra-fee5.iam.gserviceaccount.com
  CLOUD_SQL_INSTANCE: prj-bu1-n-pd-infra-fee5:us-central1:copo3-n-psql-050f7fc3
  CLOUD_SQL_INSTANCE_SHORT: copo3-n-psql-050f7fc3

7.2 Step-by-Step Sequence

flowchart TD
    S1["1. Checkout repository"] --> S2
    S2["2. Authenticate to GCP\n(GCP_PR_PREVIEW_DEPLOYER_SA_KEY)"] --> S3
    S3["3. Set up Cloud SDK\n(install cloud-sql-proxy component)"] --> S4
    S4["4. Generate resource names\nfrom github.event.pull_request.number"] --> S5
    S5{"5. Poll for Docker images\nCheck Runs API\n60 tries × 30s = 30 min max"}
    S5 -->|"check = success"| S6
    S5 -->|"timeout"| FAIL[Workflow fails]
    S6["6. Delete existing API service\n(force-release DB connections)\nsleep 10s for drain"] --> S7
    S7["7. Create PR database\n- Drop existing copo3_pr_N\n- Create fresh database\n- Create temp admin user\n- GRANT CREATE ON SCHEMA public to hbcp_copo3_user\n- Delete temp admin user"] --> S8
    S8["8. Deploy API Cloud Run service\n1 CPU / 512Mi / 0-2 instances\nSpring staging profile\nCloud SQL attached"] --> S9
    S9["9. Deploy Frontend Cloud Run service\nport 80 / 1 CPU / 256Mi / 0-2 instances"] --> S10
    S10["10. Deploy Proxy Cloud Run service\nport 30000\nAPI_UPSTREAM + FRONTEND_UPSTREAM env vars"] --> S11
    S11["11. Post PR comment\nProxy URL + QR code + commit SHA + service table"] --> S12
    S12["12. Write job summary\nto GITHUB_STEP_SUMMARY"]

7.3 Step Details

Step 4 - Name generation. All resource names derive from the PR number, set as shell variables early in the job:

PR_NUM="${{ github.event.pull_request.number }}"
API_SERVICE="copo-pr-${PR_NUM}-api"
FRONTEND_SERVICE="copo-pr-${PR_NUM}-frontend"
PROXY_SERVICE="copo-pr-${PR_NUM}-proxy"
DB_NAME="copo3_pr_${PR_NUM}"

Step 5 - Image polling. The workflow cannot deploy until Cloud Build finishes building the images for the current commit. The workflow polls the GitHub Check Runs API using the commit SHA and looks for a check named the-consumer-portal-docker-images with a conclusion of success. It retries up to 60 times with a 30-second interval (maximum 30 minutes):

CONCLUSION=""
for i in $(seq 1 60); do
  CONCLUSION=$(gh api \
    /repos/${{ github.repository }}/commits/${{ github.sha }}/check-runs \
    --jq '.check_runs[] | select(.name == "the-consumer-portal-docker-images") | .conclusion' \
    2>/dev/null | head -1)
  if [ "$CONCLUSION" = "success" ]; then break; fi
  sleep 30
done

# Fail the workflow if the images never became available within the window
if [ "$CONCLUSION" != "success" ]; then
  echo "Timed out waiting for the-consumer-portal-docker-images check" >&2
  exit 1
fi

Step 6 - Delete existing API service. On a PR push (synchronize event), the API service from the previous push may still be running with active database connections. Deleting it first ensures connections are released before the database is dropped in step 7. A 10-second sleep provides a drain window:

gcloud run services delete "${API_SERVICE}" \
  --project="${PROJECT_ID}" \
  --region="${REGION}" \
  --quiet 2>/dev/null || true
sleep 10

Step 7 - Database creation. The database is dropped and recreated on every push (not just when the PR first opens). This ensures the schema is always fresh and avoids Flyway migration checksum conflicts when a developer amends or rewrites migrations during active development. A temporary admin user is created just long enough to grant schema creation privileges to the application user:

# Drop database if it exists
gcloud sql databases delete "${DB_NAME}" \
  --instance="${CLOUD_SQL_INSTANCE_SHORT}" \
  --project="${PROJECT_ID}" --quiet 2>/dev/null || true

# Create fresh database
gcloud sql databases create "${DB_NAME}" \
  --instance="${CLOUD_SQL_INSTANCE_SHORT}" \
  --project="${PROJECT_ID}"

# Grant schema privileges via temporary admin user
gcloud sql users create pr_preview_tmp_admin \
  --instance="${CLOUD_SQL_INSTANCE_SHORT}" \
  --project="${PROJECT_ID}" \
  --password="$(openssl rand -base64 32)"

# Connect via cloud-sql-proxy and run GRANT statement
# Then delete the temp user
gcloud sql users delete pr_preview_tmp_admin \
  --instance="${CLOUD_SQL_INSTANCE_SHORT}" \
  --project="${PROJECT_ID}" --quiet

Step 8 - Deploy API service. The API is a Spring Boot application deployed with a staging profile. It connects to Cloud SQL via the Cloud SQL Auth Proxy using a Unix socket path. Application-specific environment variables and Secret Manager references are injected here:

gcloud run deploy "${API_SERVICE}" \
  --image="${IMAGE_REPO}/api:${GITHUB_SHA}" \
  --project="${PROJECT_ID}" \
  --region="${REGION}" \
  --service-account="${RUNTIME_SA}" \
  --add-cloudsql-instances="${CLOUD_SQL_INSTANCE}" \
  --set-env-vars="SPRING_PROFILES_ACTIVE=staging" \
  --set-env-vars="SPRING_DATASOURCE_URL=jdbc:postgresql://localhost/${DB_NAME}?socketFactory=com.google.cloud.sql.postgres.SocketFactory&socketFactoryArg=/cloudsql/${CLOUD_SQL_INSTANCE}" \
  --set-env-vars="YOUR_APP_ENV_VAR=value" \
  --cpu=1 \
  --memory=512Mi \
  --min-instances=0 \
  --max-instances=2 \
  --labels="pr-number=${PR_NUM}" \
  --allow-unauthenticated \
  --port=8080

Step 9 - Deploy Frontend service. The frontend is a static React build served by Caddy on port 80:

gcloud run deploy "${FRONTEND_SERVICE}" \
  --image="${IMAGE_REPO}/frontend:${GITHUB_SHA}" \
  --project="${PROJECT_ID}" \
  --region="${REGION}" \
  --service-account="${RUNTIME_SA}" \
  --cpu=1 \
  --memory=256Mi \
  --min-instances=0 \
  --max-instances=2 \
  --labels="pr-number=${PR_NUM}" \
  --allow-unauthenticated \
  --port=80

Step 10 - Deploy Proxy service. The proxy Caddy instance receives the internal URLs of the API and frontend services as environment variables and configures its routing rules from them:

API_URL=$(gcloud run services describe "${API_SERVICE}" \
  --project="${PROJECT_ID}" --region="${REGION}" \
  --format="value(status.url)")
FRONTEND_URL=$(gcloud run services describe "${FRONTEND_SERVICE}" \
  --project="${PROJECT_ID}" --region="${REGION}" \
  --format="value(status.url)")

gcloud run deploy "${PROXY_SERVICE}" \
  --image="${IMAGE_REPO}/proxy_router:${GITHUB_SHA}" \
  --project="${PROJECT_ID}" \
  --region="${REGION}" \
  --service-account="${RUNTIME_SA}" \
  --set-env-vars="API_UPSTREAM=${API_URL}" \
  --set-env-vars="FRONTEND_UPSTREAM=${FRONTEND_URL}" \
  --cpu=1 \
  --memory=256Mi \
  --min-instances=0 \
  --max-instances=2 \
  --labels="pr-number=${PR_NUM}" \
  --allow-unauthenticated \
  --port=30000

Step 11 - PR comment. After all services are deployed, the workflow posts a comment on the PR with the proxy URL formatted as a clickable link, a QR code image (generated from a QR code API), the commit SHA, and a table showing the individual API and frontend URLs for debugging.
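As a sketch of how such a comment body might be assembled: the proxy URL, commit SHA, and QR endpoint below are illustrative stand-ins, and the QR service shown (api.qrserver.com) is a hypothetical choice since this document does not name the actual API used.

```shell
#!/bin/sh
# Illustrative values -- in the workflow these come from gcloud and the PR event.
PROXY_URL="https://copo-pr-42-proxy-abc123-uc.a.run.app"
SHA="a3f2c1d"

# Hypothetical public QR-code API; the real endpoint used by the workflow may differ.
QR_URL="https://api.qrserver.com/v1/create-qr-code/?size=200x200&data=${PROXY_URL}"

# Markdown body for the PR comment.
COMMENT="## PR Preview Ready
**URL:** ${PROXY_URL}
**Commit:** \`${SHA}\`

![QR code](${QR_URL})"

echo "$COMMENT"
```

The same string can then be posted with `gh pr comment` or the GitHub REST API.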

Step 12 - Job summary. The same information is written to $GITHUB_STEP_SUMMARY so it appears in the GitHub Actions run summary without requiring the reviewer to open the PR.


8. Cleanup Lifecycle

The cleanup-preview.yml workflow runs on pull_request events with type [closed] targeting main. It tears down all resources regardless of whether the PR was merged or simply closed.

8.1 Step-by-Step Sequence

  1. Authenticate to GCP using the same GCP_PR_PREVIEW_DEPLOYER_SA_KEY secret
  2. Set up Cloud SDK
  3. Generate resource names from github.event.pull_request.number (same logic as deploy)
  4. Retrieve proxy URL before deletion — any OAuth/IdP deregistration step (if applicable) needs the proxy URL before the proxy service is deleted. The URL is captured first:
    PROXY_URL=$(gcloud run services describe "${PROXY_SERVICE}" \
      --project="${PROJECT_ID}" --region="${REGION}" \
      --format="value(status.url)" 2>/dev/null || echo "")
  5. Remove OAuth/IdP callback URLs (if applicable) — if the application uses an OAuth provider that requires callback URL registration, make the appropriate API call to deregister the proxy URL before service deletion
  6. Drop the PR database using gcloud sql databases delete
  7. Delete the API Cloud Run service (continue-on-error: true)
  8. Delete the Frontend Cloud Run service (continue-on-error: true)
  9. Delete the Proxy Cloud Run service (continue-on-error: true)
  10. Post cleanup confirmation comment on the PR

8.2 Resilience Strategy

Every deletion step in the cleanup workflow uses continue-on-error: true. This is intentional: if the API service was already deleted (for example, due to a previous failed deployment), the step should not block deletion of the frontend and proxy services. Without this flag, a single “service not found” error would leave the remaining resources as orphans.
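The effect can be illustrated with a small shell simulation, where `|| echo ...` plays the role of `continue-on-error: true` (the `delete_service` function is a stand-in for the real `gcloud run services delete` call):

```shell
#!/bin/sh
# Simulates a cleanup run where the API service was already deleted by an
# earlier run; the remaining deletions must still proceed.
delete_service() {
  if [ "$1" = "copo-pr-42-api" ]; then
    echo "ERROR: service $1 not found" >&2
    return 1
  fi
  echo "deleted $1"
}

for SVC in copo-pr-42-api copo-pr-42-frontend copo-pr-42-proxy; do
  delete_service "$SVC" || echo "continuing past failed delete of $SVC"
done
```

In the actual workflow the isolation happens at the step level via `continue-on-error: true`, which is preferable because the step's failure remains visible in the run log.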

The cleanup comment is posted at the end regardless of individual step failures, giving the developer visibility into whether cleanup completed. If a resource was not deleted (for example, because OAuth/IdP deregistration failed), it can be cleaned up manually by re-running the cleanup workflow or by running the gcloud commands directly.

8.3 Order of Operations Rationale

The cleanup order matters for correctness:

  • The proxy URL is captured before any deletion, because the proxy URL is needed for any OAuth/IdP deregistration call
  • OAuth/IdP callbacks are removed (if applicable) before services are deleted, ensuring the OAuth/IdP client is always left in a consistent state even if service deletion fails
  • The database is dropped before the API service is deleted (in the cleanup case), because once the database is gone, any residual API connections will fail — this is acceptable because the PR is already closed

9. Image Build Pipeline

9.1 Cloud Build Configuration

Images are built by a Cloud Build trigger configured in the the-helper-bees GCP project. The trigger fires on pushes to the main branch of the the_consumer_portal repository (and on PR commits via a connected trigger). The build is defined in build/cloudbuild.yml.

Key configuration parameters:

| Parameter | Value |
| --- | --- |
| Machine type | E2_HIGHCPU_32 (32 vCPUs, optimized for parallel compilation) |
| Timeout | 1200s (20 minutes) |
| Build tool | Kaniko (gcr.io/kaniko-project/executor) |
| Cache TTL | 72h (72-hour layer cache for faster incremental builds) |
| Image tag | $COMMIT_SHA (the Git commit SHA of the triggering push) |

9.2 Parallel Build Strategy

All three images are built in parallel by setting waitFor: ["-"] on each step. This means Cloud Build starts all three Kaniko executors simultaneously rather than sequentially. On a 32-vCPU machine, this reduces total build time significantly compared to building in series.

steps:
  - name: gcr.io/kaniko-project/executor
    id: build-api
    waitFor: ["-"]
    args:
      - --dockerfile=compose/api/Dockerfile
      - --context=.
      - --destination=gcr.io/the-helper-bees/the_consumer_portal/api:$COMMIT_SHA
      - --cache=true
      - --cache-ttl=72h

  - name: gcr.io/kaniko-project/executor
    id: build-frontend
    waitFor: ["-"]
    args:
      - --dockerfile=compose/frontend/Dockerfile
      - --context=.
      - --destination=gcr.io/the-helper-bees/the_consumer_portal/frontend:$COMMIT_SHA
      - --build-arg=VITE_API_URL=
      - --build-arg=VITE_STRIPE_PUBLISHABLE_KEY={STRIPE_TEST_KEY}
      - --cache=true
      - --cache-ttl=72h

  - name: gcr.io/kaniko-project/executor
    id: build-proxy
    waitFor: ["-"]
    args:
      - --dockerfile=compose/proxy_router/Dockerfile
      - --context=.
      - --destination=gcr.io/the-helper-bees/the_consumer_portal/proxy_router:$COMMIT_SHA
      - --cache=true
      - --cache-ttl=72h

9.3 Trigger Configuration

The Cloud Build trigger is not managed as code (see Section 13, Limitation 3). It is configured manually in the GCP console under the the-helper-bees project. The trigger:

  • Connects to the the_consumer_portal GitHub repository
  • Fires on push events (both to main and on PR commits)
  • Reports its result as a GitHub Check named the-consumer-portal-docker-images

This check name is the exact string the deploy-preview.yml workflow polls for in Step 5 (see Section 7.3).

9.4 Deploy Workflow Integration

The deploy workflow does not trigger Cloud Build — Cloud Build fires independently on the git push event. The deploy workflow’s job is to wait until Cloud Build finishes before proceeding to deployment. This polling approach ensures the workflow fails informatively (rather than deploying a stale image) if Cloud Build is slow or fails.

If the Cloud Build check does not reach success within 30 minutes (60 polls × 30 seconds), the deploy workflow fails with a timeout error, leaving the existing preview environment (from the previous push) intact.
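The retry-until-success pattern can be exercised locally with a stubbed check lookup; the stub below is purely illustrative (the real workflow queries the GitHub Check Runs API and sleeps 30 seconds between attempts):

```shell
#!/bin/sh
# Stub: the check "concludes" successfully on the third attempt.
get_conclusion() {
  if [ "$1" -ge 3 ]; then echo "success"; fi
}

CONCLUSION=""
for i in $(seq 1 5); do
  CONCLUSION=$(get_conclusion "$i")
  if [ "$CONCLUSION" = "success" ]; then break; fi
  # the real workflow sleeps 30 seconds here
done

if [ "$CONCLUSION" = "success" ]; then
  echo "images ready after $i tries"
else
  echo "timed out" >&2
  exit 1
fi
```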


10. Application-Specific Integrations

Some applications require additional setup steps during preview deployment. For example, CoPo 3.0 registers the preview proxy URL as an allowed callback, logout, and web origin URL with its Auth0 OAuth provider during deployment, and removes it during cleanup. If your application uses an OAuth/IdP provider that requires callback URL registration, add similar steps to your deploy and cleanup workflows.


11. Naming Conventions & Parameterization

The pull request number (PR_NUM) is the primary identifier that flows through every resource name in the system. Using the PR number (rather than, for example, the commit SHA or branch name) has several advantages: it is stable across pushes to the same PR, it is short and human-readable, and it is unique within a repository.

| Resource | Pattern | Example (PR #42) |
| --- | --- | --- |
| API Cloud Run service | {app-prefix}-pr-{PR_NUM}-api | copo-pr-42-api |
| Frontend Cloud Run service | {app-prefix}-pr-{PR_NUM}-frontend | copo-pr-42-frontend |
| Proxy Cloud Run service | {app-prefix}-pr-{PR_NUM}-proxy | copo-pr-42-proxy |
| PostgreSQL database | {db-prefix}_pr_{PR_NUM} | copo3_pr_42 |
| Docker image tag | {commit-sha} | a3f2c1d8... (full 40-char SHA) |
| Cloud Run service label | pr-number={PR_NUM} | pr-number=42 |
| Cloud Build check name | {app}-docker-images | the-consumer-portal-docker-images |

Note that Cloud Run service names use hyphens (following GCP’s naming constraints) while database names use underscores (following PostgreSQL identifier rules). Both follow the same {prefix}-pr-{N} or {prefix}_pr_{N} structure.

The app prefix (copo for services, copo3 for databases) is the application-specific identifier. When adopting this pattern for a new application, choose a short, stable identifier and use it consistently across all resource names.
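The scheme can be captured in a single helper function for scripts that need all the names at once; the function name is illustrative, while the copo/copo3 prefixes come straight from this section:

```shell
#!/bin/sh
# Derives every per-PR resource name from the PR number, using the
# CoPo 3.0 prefixes (copo for services, copo3 for the database).
pr_resource_names() {
  pr_num="$1"
  echo "api=copo-pr-${pr_num}-api"
  echo "frontend=copo-pr-${pr_num}-frontend"
  echo "proxy=copo-pr-${pr_num}-proxy"
  echo "db=copo3_pr_${pr_num}"
}

pr_resource_names 42
```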


12. Cost & Resource Constraints

12.1 Per-Service Resource Limits

| Service | CPU | Memory | Min Instances | Max Instances |
| --- | --- | --- | --- | --- |
| API | 1 vCPU | 512Mi | 0 | 2 |
| Frontend | 1 vCPU | 256Mi | 0 | 2 |
| Proxy | 1 vCPU | 256Mi | 0 | 2 |

All services use --min-instances=0, meaning they scale to zero when no requests are being handled. A preview environment that no one is actively using incurs no Cloud Run compute cost. The first request after scale-down incurs a cold start latency of 1-5 seconds depending on the service.

12.2 Zero-Scaling Cost Model

With --min-instances=0, Cloud Run bills only for the CPU and memory consumed while requests are being processed. A preview environment that receives 100 requests per day, each handled in 500ms, would consume:

  • API: ~0.014 vCPU-hours + ~7Mi-hours of memory per day
  • At GCP Cloud Run pricing, this is effectively negligible (fractions of a cent per day)
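That figure can be checked with quick arithmetic: 100 requests × 500 ms is 50 billable seconds per day on the 1 vCPU / 512Mi API service.

```shell
# Sanity-check the daily consumption estimate for the API service:
# 100 requests x 0.5 s each = 50 billable seconds of 1 vCPU and 512 Mi.
awk 'BEGIN {
  billable_s = 100 * 0.5
  printf "vCPU-hours/day: %.3f\n", billable_s / 3600        # 1 vCPU
  printf "Mi-hours/day:   %.1f\n", billable_s / 3600 * 512  # 512 Mi
}'
```

This reproduces the ~0.014 vCPU-hours and ~7 Mi-hours figures quoted above.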

The dominant cost for idle preview environments is the Cloud SQL instance, which runs continuously regardless of PR activity. Since all previews share a single Cloud SQL instance (copo3-n-psql-050f7fc3), this cost is shared and not attributable to individual previews.

12.3 Shared Cloud SQL Constraints

All PR preview databases live on the same Cloud SQL PostgreSQL instance. The practical limits are:

| Constraint | Implication |
| --- | --- |
| Cloud SQL max connections | Each Cloud Run instance opens connections via the Auth Proxy. With a max of 2 instances per service × N open PRs, connection count scales linearly with active PRs. |
| Disk space | Each PR database accumulates data from Flyway migrations and any test data. Disk usage grows with the number of concurrent PRs. |
| Single instance availability | If the Cloud SQL instance goes down, all preview environments go down simultaneously. |

As a practical guideline, the system can comfortably support 10-20 concurrent open PRs before connection pool pressure on the shared Cloud SQL instance becomes a concern. Beyond that, consider either increasing the Cloud SQL max connections limit or moving to per-PR Cloud SQL instances (at higher cost).
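A back-of-envelope sketch of the connection math, assuming (illustratively) a per-instance pool of 10 connections and that only the API service attaches the Cloud SQL instance, as in the reference deploy workflow:

```shell
# Back-of-envelope Cloud SQL connection estimate. POOL_SIZE is an assumed
# per-instance connection pool, not a measured value; only the API service
# attaches the Cloud SQL instance in the reference deploy workflow.
MAX_INSTANCES=2
POOL_SIZE=10

connections_for_prs() {
  echo $(( $1 * MAX_INSTANCES * POOL_SIZE ))
}

echo "10 open PRs -> up to $(connections_for_prs 10) connections"
echo "20 open PRs -> up to $(connections_for_prs 20) connections"
```

Depending on the instance tier's max_connections setting (often a few hundred on small tiers), this worst-case bound is why pressure starts to matter around the 10-20 concurrent PR mark.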

12.4 Rough Cost Estimate Per Preview Environment

Assuming a preview environment that is actively used for 4 hours per day over a 5-day PR lifecycle:

| Resource | Estimated Cost |
| --- | --- |
| Cloud Run compute (3 services, 4h/day, 5 days) | ~$0.10-0.30 |
| Cloud SQL storage increment (per-PR schema data) | ~$0.01 |
| Cloud Build build time (E2_HIGHCPU_32, ~8 min per push, 5 pushes) | ~$0.15-0.25 |
| **Total per PR lifecycle** | **~$0.25-0.55** |

These estimates are approximate and assume GCP us-central1 pricing. The Cloud SQL instance itself is a shared cost across all teams using the non-production environment.


13. Known Limitations & Future Work

The following limitations are acknowledged and represent potential future improvements.

  1. No TTL or auto-cleanup mechanism. Orphaned preview environments — from PRs that were force-closed, abandoned, or had failed cleanup workflows — persist indefinitely and require manual deletion. A future improvement could use Cloud Scheduler to run a periodic job that lists all Cloud Run services with pr-number labels, checks whether the corresponding PR is still open via the GitHub API, and deletes resources for closed PRs.

  2. No Workload Identity Federation (WIF). The system uses JSON key-based service account authentication. JSON keys are long-lived credentials that require manual rotation and carry a higher security risk than short-lived tokens. Migrating to WIF would allow GitHub Actions to authenticate to GCP using OIDC tokens without any stored key material. See the GCP WIF documentation for migration guidance.

  3. Cloud Build trigger not managed as code. The Cloud Build trigger connecting GitHub pushes to the image build pipeline is configured manually in the GCP console. It is not represented in Terraform or any config file, making it invisible to code review and prone to configuration drift. The trigger should be imported into Terraform using the google_cloudbuild_trigger resource.

  4. No image cleanup in GCR. Docker images tagged with commit SHAs accumulate in gcr.io/the-helper-bees/the_consumer_portal indefinitely. Over time this increases storage costs and makes the registry harder to navigate. A GCR lifecycle policy or a Cloud Scheduler job to delete images older than a threshold (e.g., 30 days) should be added.

  5. No end-to-end tests against preview environments. The current test suite runs against locally-built applications. Adding a job to the deploy-preview.yml workflow that runs a smoke test suite (for example, using Playwright or a simple curl-based healthcheck) against the deployed proxy URL would catch deployment-specific failures early.

  6. Shared Cloud SQL instance connection limits. See Section 12.3. A large number of concurrent open PRs can exhaust connection limits on the shared instance. Future work could introduce PgBouncer connection pooling or move high-volume teams to dedicated Cloud SQL instances.

  7. Shared database user (hbcp_copo3_user). All preview databases are accessed by the same PostgreSQL application user. There is no per-PR user isolation. A compromised or buggy PR deployment could theoretically affect data in other PR databases if the user has cross-database permissions. Per-PR database users would provide stronger isolation.

  8. (CoPo 3.0) OAuth callback URL accumulation if cleanup fails. If the cleanup workflow’s OAuth/IdP deregistration step fails, stale callback URLs remain in the provider’s client configuration. These accumulate over time and must be pruned manually. For CoPo 3.0 specifically, an Auth0 Management API call to list and remove all URLs matching the preview pattern (https://copo-pr-*) could be added as a periodic maintenance task.

  9. roles/cloudsql.admin is overly broad on the deployer SA. This role grants the deployer SA the ability to modify or delete any database on the Cloud SQL instance, not just PR preview databases. A future improvement would restrict the deployer SA to only the databases it owns, using IAM conditions or a more narrowly scoped custom role.
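For item 1, the sweeper's core decision is a join between labeled Cloud Run services and PR state. Below is a minimal sketch of that logic over inline sample records; a real Cloud Scheduler job would obtain them from `gcloud run services list` (filtered on the pr-number label) and the GitHub API, and then run the deletion commands from the cleanup workflow:

```shell
# Orphan detection for a periodic sweeper: flag any preview service whose
# PR is no longer open. Input records are "service pr-number pr-state";
# the sample data below stands in for gcloud + GitHub API output.
find_orphans() {
  while read -r service pr_num state; do
    if [ "$state" != "OPEN" ]; then
      echo "delete: ${service} (PR #${pr_num} is ${state})"
    fi
  done
}

find_orphans <<'EOF'
copo-pr-39-proxy 39 CLOSED
copo-pr-41-proxy 41 MERGED
copo-pr-42-proxy 42 OPEN
EOF
```

Here the CLOSED and MERGED services are flagged for deletion while the OPEN one is left alone; wiring the flagged names into `gcloud run services delete` and `gcloud sql databases delete` completes the sweeper.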


14. Implementation Guide: Adding PR Previews to a New Application

This section is a step-by-step guide for a team that wants to add PR preview environments to their own application. Replace all {PLACEHOLDER} values with your application-specific values throughout.

| Placeholder | Meaning | Example |
| --- | --- | --- |
| `{APP_NAME}` | Short application identifier (lowercase, hyphens) | `myapp` |
| `{APP_PREFIX}` | Database identifier (may differ from APP_NAME) | `myapp` |
| `{PROJECT_ID}` | GCP project ID where Cloud Run will run | `prj-bu1-n-myapp-infra-abc1` |
| `{REGION}` | GCP region for Cloud Run and Cloud SQL | `us-central1` |
| `{IMAGE_REPO}` | Base path for Docker images | `gcr.io/the-helper-bees/{APP_NAME}` |
| `{CLOUD_SQL_INSTANCE}` | Full Cloud SQL instance connection name | `prj-bu1-n-myapp-infra-abc1:us-central1:myapp-n-psql-abc123` |
| `{CLOUD_SQL_INSTANCE_SHORT}` | Cloud SQL instance short name | `myapp-n-psql-abc123` |
| `{IMAGE_PROJECT}` | GCP project where images are stored | `the-helper-bees` |
| `{CHECK_NAME}` | Cloud Build check name in GitHub | `myapp-docker-images` |
| `{DB_APP_USER}` | PostgreSQL application user granted access on each PR database | `hbcp_copo3_user` |
| `{GITHUB_ORG}/{REPO_NAME}` | GitHub repository slug used by `gh` CLI commands | `myorg/myapp` |
| `{infra-dir}` | Terraform directory within infrahive | `pd-infra` |

14.1 Prerequisites

Before starting, confirm you have the following:

  • Access to the infrahive repository (to add Terraform and run plan/apply)
  • Admin access to your application repository (to add workflows and GitHub Actions secrets)
  • A Dockerfile for each service your application deploys
  • A non-production GCP project with a Cloud Build trigger connected to your GitHub repository (currently configured manually in the GCP console; see Section 13)
  • If your application needs per-PR databases: a shared Cloud SQL PostgreSQL instance in that project

If your application does not use an OAuth/IdP provider that requires callback URL registration, skip the OAuth callback registration steps in Phase 3-4. See Section 10 for the pattern used by CoPo 3.0 as a reference example.


14.2 Phase 1: Infrastructure (Terraform)

Add a new Terraform file to the infrahive repository at:

infra/{infra-dir}/business_unit_1/non-production/{APP_NAME}_pr_preview.tf
# =============================================================================
# PR Preview Service Accounts for {APP_NAME}
# =============================================================================
# Deployer SA: used by GitHub Actions to manage Cloud Run and Cloud SQL
# Runtime SA:  attached to Cloud Run containers as workload identity
# =============================================================================

# -----------------------------------------------------------------------------
# Deployer Service Account
# -----------------------------------------------------------------------------
resource "google_service_account" "{APP_NAME}_pr_preview_deployer" {
  project      = data.google_project.env_project.project_id
  account_id   = "{APP_NAME}-pr-preview-deployer-sa"
  display_name = "{APP_NAME} PR Preview Deployer Service Account"
  description  = "Service account for GitHub Actions to deploy Cloud Run PR preview environments for {APP_NAME}"
}

locals {
  {APP_NAME}_pr_preview_deployer_roles = [
    "roles/run.admin",
    "roles/iam.serviceAccountUser",
    "roles/cloudsql.admin",  # Remove if app does not need per-PR databases
  ]

  {APP_NAME}_pr_preview_runtime_roles = [
    "roles/cloudsql.client",  # Remove if app does not use Cloud SQL
  ]
}

resource "google_project_iam_member" "{APP_NAME}_pr_preview_deployer_roles" {
  for_each = toset(local.{APP_NAME}_pr_preview_deployer_roles)
  project  = data.google_project.env_project.project_id
  role     = each.key
  member   = "serviceAccount:${google_service_account.{APP_NAME}_pr_preview_deployer.email}"
}

# -----------------------------------------------------------------------------
# Cross-project: allow deployer to verify images in the image registry project
# -----------------------------------------------------------------------------
resource "google_project_iam_member" "{APP_NAME}_pr_preview_deployer_artifact_registry" {
  project = "{IMAGE_PROJECT}"
  role    = "roles/artifactregistry.reader"
  member  = "serviceAccount:${google_service_account.{APP_NAME}_pr_preview_deployer.email}"
}

# -----------------------------------------------------------------------------
# Cloud Run Service Agent: allow GCP to pull images from the image project
# This is the agent that actually pulls images during gcloud run deploy.
# -----------------------------------------------------------------------------
resource "google_project_iam_member" "{APP_NAME}_cloud_run_service_agent_artifact_registry" {
  project = "{IMAGE_PROJECT}"
  role    = "roles/artifactregistry.reader"
  member  = "serviceAccount:service-${data.google_project.env_project.number}@serverless-robot-prod.iam.gserviceaccount.com"
}

# Required for legacy gcr.io/ image paths (GCR stores images in GCS buckets)
resource "google_storage_bucket_iam_member" "{APP_NAME}_cloud_run_service_agent_gcr" {
  bucket = "artifacts.{IMAGE_PROJECT}.appspot.com"
  role   = "roles/storage.objectViewer"
  member = "serviceAccount:service-${data.google_project.env_project.number}@serverless-robot-prod.iam.gserviceaccount.com"
}

# Omit the GCS bucket binding if using Artifact Registry (*.pkg.dev) URLs instead of legacy GCR (gcr.io/) URLs.

# -----------------------------------------------------------------------------
# Runtime Service Account
# -----------------------------------------------------------------------------
resource "google_service_account" "{APP_NAME}_pr_preview_runtime" {
  project      = data.google_project.env_project.project_id
  account_id   = "{APP_NAME}-pr-preview-runtime-sa"
  display_name = "{APP_NAME} PR Preview Runtime Service Account"
  description  = "Service account attached to Cloud Run PR preview containers for {APP_NAME}"
}

resource "google_project_iam_member" "{APP_NAME}_pr_preview_runtime_cloudsql" {
  for_each = toset(local.{APP_NAME}_pr_preview_runtime_roles)
  project  = data.google_project.env_project.project_id
  role     = each.key
  member   = "serviceAccount:${google_service_account.{APP_NAME}_pr_preview_runtime.email}"
}

# -----------------------------------------------------------------------------
# Secret Manager: shells for SA keys (populate with create-pr-preview-sa-keys.sh)
# -----------------------------------------------------------------------------
resource "google_secret_manager_secret" "{APP_NAME}_pr_preview_deployer_sa_key" {
  project   = data.google_project.env_project.project_id
  secret_id = "{APP_NAME}-pr-preview-deployer-sa-key"

  replication {
    auto {}
  }
}

resource "google_secret_manager_secret" "{APP_NAME}_pr_preview_runtime_sa_key" {
  project   = data.google_project.env_project.project_id
  secret_id = "{APP_NAME}-pr-preview-runtime-sa-key"

  replication {
    auto {}
  }
}

# -----------------------------------------------------------------------------
# Outputs (add to outputs.tf in the same directory)
# -----------------------------------------------------------------------------
# output "{APP_NAME}_pr_preview_deployer_sa_email" {
#   value = google_service_account.{APP_NAME}_pr_preview_deployer.email
# }
# output "{APP_NAME}_pr_preview_deployer_sa_key_secret_id" {
#   value = google_secret_manager_secret.{APP_NAME}_pr_preview_deployer_sa_key.secret_id
# }
# output "{APP_NAME}_pr_preview_runtime_sa_email" {
#   value = google_service_account.{APP_NAME}_pr_preview_runtime.email
# }

After writing the Terraform file, apply it:

# From the infrahive root
./zig/zig build plan -- {infra-dir} non-production
./zig/zig build apply -- {infra-dir} non-production

14.3 Phase 2: Cloud Build Configuration

Add or update build/cloudbuild.yml in your application repository. Adjust the steps for your application’s actual Docker images:

# build/cloudbuild.yml
# Cloud Build configuration for PR preview image builds.
# Triggered by: Cloud Build trigger connected to GitHub (configured in GCP console).
# GitHub Check name reported: {CHECK_NAME}

timeout: "1200s"

options:
  machineType: E2_HIGHCPU_32
  logging: CLOUD_LOGGING_ONLY

steps:
  # Build each image in parallel using Kaniko.
  # waitFor: ["-"] means "start immediately, don't wait for any prior step".
  # Adjust --dockerfile and --destination paths for your application.

  - name: gcr.io/kaniko-project/executor
    id: build-api
    waitFor: ["-"]
    args:
      - --dockerfile=path/to/api/Dockerfile
      - --context=.
      - --destination={IMAGE_REPO}/api:$COMMIT_SHA
      - --cache=true
      - --cache-ttl=72h

  - name: gcr.io/kaniko-project/executor
    id: build-frontend
    waitFor: ["-"]
    args:
      - --dockerfile=path/to/frontend/Dockerfile
      - --context=.
      - --destination={IMAGE_REPO}/frontend:$COMMIT_SHA
      # Add --build-arg entries for your frontend environment variables
      - --cache=true
      - --cache-ttl=72h

  # Add or remove image build steps based on your application's service count.
  # A simple single-service app may only need one step here.

If your application uses only a single Docker image (no proxy needed), remove the extra steps and simplify the deploy workflow accordingly.


14.4 Phase 3: Deploy Workflow

Create .github/workflows/deploy-preview.yml in your application repository:

# .github/workflows/deploy-preview.yml
# Deploys a PR preview environment on Cloud Run.
# Triggers on: PR open, push, or reopen against main.

name: Deploy PR Preview

on:
  pull_request:
    types: [opened, synchronize, reopened]
    branches: [main]

env:
  PROJECT_ID: {PROJECT_ID}
  REGION: {REGION}
  IMAGE_REPO: {IMAGE_REPO}
  RUNTIME_SA: {APP_NAME}-pr-preview-runtime-sa@{PROJECT_ID}.iam.gserviceaccount.com
  CLOUD_SQL_INSTANCE: {CLOUD_SQL_INSTANCE}
  CLOUD_SQL_INSTANCE_SHORT: {CLOUD_SQL_INSTANCE_SHORT}

jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write   # to post PR comments
      checks: read           # to poll Cloud Build check status

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      # Authenticate to GCP using the deployer SA JSON key
      - name: Authenticate to GCP
        uses: google-github-actions/auth@v2
        with:
          credentials_json: ${{ secrets.GCP_PR_PREVIEW_DEPLOYER_SA_KEY }}

      - name: Set up Cloud SDK
        uses: google-github-actions/setup-gcloud@v2
        with:
          # Include cloud-sql-proxy if your app uses Cloud SQL
          install_components: cloud-sql-proxy

      # Generate all resource names from the PR number
      - name: Set preview environment variables
        run: |
          PR_NUM="${{ github.event.pull_request.number }}"
          echo "PR_NUM=${PR_NUM}" >> $GITHUB_ENV
          echo "API_SERVICE={APP_NAME}-pr-${PR_NUM}-api" >> $GITHUB_ENV
          echo "FRONTEND_SERVICE={APP_NAME}-pr-${PR_NUM}-frontend" >> $GITHUB_ENV
          echo "PROXY_SERVICE={APP_NAME}-pr-${PR_NUM}-proxy" >> $GITHUB_ENV
          echo "DB_NAME={APP_PREFIX}_pr_${PR_NUM}" >> $GITHUB_ENV
          # Remove the DB_NAME line if your app does not use a database

      # Wait for Cloud Build to finish building images for this commit.
      # The check name must match what your Cloud Build trigger reports to GitHub.
      - name: Wait for Docker images
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          echo "Waiting for Cloud Build check: {CHECK_NAME}"
          for i in $(seq 1 60); do
            CONCLUSION=$(gh api \
              /repos/${{ github.repository }}/commits/${{ github.sha }}/check-runs \
              --jq '.check_runs[] | select(.name == "{CHECK_NAME}") | .conclusion' \
              2>/dev/null | head -1)
            echo "Attempt ${i}/60: conclusion=${CONCLUSION}"
            if [ "$CONCLUSION" = "success" ]; then
              echo "Images ready."
              exit 0
            fi
            if [ "$CONCLUSION" = "failure" ] || [ "$CONCLUSION" = "cancelled" ]; then
              echo "Cloud Build failed. Aborting deployment."
              exit 1
            fi
            sleep 30
          done
          echo "Timed out waiting for images after 30 minutes."
          exit 1

      # Delete the existing API service to release database connections before
      # dropping and recreating the database. Skip if your app has no database.
      - name: Delete existing API service (release DB connections)
        run: |
          gcloud run services delete "${API_SERVICE}" \
            --project="${PROJECT_ID}" \
            --region="${REGION}" \
            --quiet 2>/dev/null || true
          sleep 10

      # Create a fresh PR database on the shared Cloud SQL instance.
      # Remove this step if your app does not require a per-PR database.
      - name: Create PR database
        run: |
          # Drop existing database (fresh schema prevents migration conflicts)
          gcloud sql databases delete "${DB_NAME}" \
            --instance="${CLOUD_SQL_INSTANCE_SHORT}" \
            --project="${PROJECT_ID}" \
            --quiet 2>/dev/null || true

          # Create the PR database
          gcloud sql databases create "${DB_NAME}" \
            --instance="${CLOUD_SQL_INSTANCE_SHORT}" \
            --project="${PROJECT_ID}"

          # Grant schema privileges to the app user via a temporary admin user.
          # Adjust the GRANT statement for your database schema requirements.
          TMP_PASS=$(openssl rand -base64 32)
          gcloud sql users create pr_preview_tmp_admin \
            --instance="${CLOUD_SQL_INSTANCE_SHORT}" \
            --project="${PROJECT_ID}" \
            --password="${TMP_PASS}"

          # Connect via cloud-sql-proxy to run the GRANT
          cloud-sql-proxy "${CLOUD_SQL_INSTANCE}" --port=5433 &
          PROXY_PID=$!
          sleep 3
          PGPASSWORD="${TMP_PASS}" psql \
            -h 127.0.0.1 -p 5433 \
            -U pr_preview_tmp_admin \
            -d "${DB_NAME}" \
            -c "GRANT CREATE ON SCHEMA public TO {DB_APP_USER};"
          kill $PROXY_PID

          gcloud sql users delete pr_preview_tmp_admin \
            --instance="${CLOUD_SQL_INSTANCE_SHORT}" \
            --project="${PROJECT_ID}" \
            --quiet

      # Deploy the API service.
      # Customize --set-env-vars and --set-secrets for your application.
      - name: Deploy API
        run: |
          gcloud run deploy "${API_SERVICE}" \
            --image="${IMAGE_REPO}/api:${{ github.sha }}" \
            --project="${PROJECT_ID}" \
            --region="${REGION}" \
            --service-account="${RUNTIME_SA}" \
            --add-cloudsql-instances="${CLOUD_SQL_INSTANCE}" \
            --set-env-vars="DB_NAME=${DB_NAME}" \
            --cpu=1 \
            --memory=512Mi \
            --min-instances=0 \
            --max-instances=2 \
            --labels="pr-number=${PR_NUM}" \
            --allow-unauthenticated \
            --port=8080

      # Deploy the frontend service.
      # Adjust port and memory based on your frontend's requirements.
      - name: Deploy Frontend
        run: |
          gcloud run deploy "${FRONTEND_SERVICE}" \
            --image="${IMAGE_REPO}/frontend:${{ github.sha }}" \
            --project="${PROJECT_ID}" \
            --region="${REGION}" \
            --service-account="${RUNTIME_SA}" \
            --cpu=1 \
            --memory=256Mi \
            --min-instances=0 \
            --max-instances=2 \
            --labels="pr-number=${PR_NUM}" \
            --allow-unauthenticated \
            --port=80

      # Deploy the proxy service that routes traffic between API and frontend.
      # Remove this step if your app is a single service (no proxy needed).
      - name: Deploy Proxy
        run: |
          API_URL=$(gcloud run services describe "${API_SERVICE}" \
            --project="${PROJECT_ID}" --region="${REGION}" \
            --format="value(status.url)")
          FRONTEND_URL=$(gcloud run services describe "${FRONTEND_SERVICE}" \
            --project="${PROJECT_ID}" --region="${REGION}" \
            --format="value(status.url)")

          gcloud run deploy "${PROXY_SERVICE}" \
            --image="${IMAGE_REPO}/proxy_router:${{ github.sha }}" \
            --project="${PROJECT_ID}" \
            --region="${REGION}" \
            --service-account="${RUNTIME_SA}" \
            --set-env-vars="API_UPSTREAM=${API_URL}" \
            --set-env-vars="FRONTEND_UPSTREAM=${FRONTEND_URL}" \
            --cpu=1 \
            --memory=256Mi \
            --min-instances=0 \
            --max-instances=2 \
            --labels="pr-number=${PR_NUM}" \
            --allow-unauthenticated \
            --port=30000

      - name: Get preview URL
        run: |
          PROXY_URL=$(gcloud run services describe "${PROXY_SERVICE}" \
            --project="${PROJECT_ID}" --region="${REGION}" \
            --format="value(status.url)")
          echo "PROXY_URL=${PROXY_URL}" >> $GITHUB_ENV

      # Post a comment on the PR with the access URL.
      - name: Post PR comment
        uses: actions/github-script@v7
        with:
          script: |
            const proxyUrl = process.env.PROXY_URL;
            const sha = context.sha.substring(0, 7);
            const qrUrl = `https://api.qrserver.com/v1/create-qr-code/?size=150x150&data=${encodeURIComponent(proxyUrl)}`;
            const body = [
              `## PR Preview Environment`,
              ``,
              `**URL:** ${proxyUrl}`,
              `**Commit:** \`${sha}\``,
              ``,
              `![QR Code](${qrUrl})`,
            ].join('\n');
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body,
            });

      - name: Write job summary
        run: |
          echo "## PR Preview Deployed" >> $GITHUB_STEP_SUMMARY
          echo "" >> $GITHUB_STEP_SUMMARY
          echo "**URL:** ${PROXY_URL}" >> $GITHUB_STEP_SUMMARY
          echo "**Commit:** ${{ github.sha }}" >> $GITHUB_STEP_SUMMARY

14.5 Phase 4: Cleanup Workflow

Create .github/workflows/cleanup-preview.yml in your application repository:

# .github/workflows/cleanup-preview.yml
# Tears down the PR preview environment when the PR is closed or merged.

name: Cleanup PR Preview

on:
  pull_request:
    types: [closed]
    branches: [main]

env:
  PROJECT_ID: {PROJECT_ID}
  REGION: {REGION}
  CLOUD_SQL_INSTANCE_SHORT: {CLOUD_SQL_INSTANCE_SHORT}

jobs:
  cleanup:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write

    steps:
      - name: Authenticate to GCP
        uses: google-github-actions/auth@v2
        with:
          credentials_json: ${{ secrets.GCP_PR_PREVIEW_DEPLOYER_SA_KEY }}

      - name: Set up Cloud SDK
        uses: google-github-actions/setup-gcloud@v2

      # Generate the same resource names using the PR number
      - name: Set preview environment variables
        run: |
          PR_NUM="${{ github.event.pull_request.number }}"
          echo "PR_NUM=${PR_NUM}" >> $GITHUB_ENV
          echo "API_SERVICE={APP_NAME}-pr-${PR_NUM}-api" >> $GITHUB_ENV
          echo "FRONTEND_SERVICE={APP_NAME}-pr-${PR_NUM}-frontend" >> $GITHUB_ENV
          echo "PROXY_SERVICE={APP_NAME}-pr-${PR_NUM}-proxy" >> $GITHUB_ENV
          echo "DB_NAME={APP_PREFIX}_pr_${PR_NUM}" >> $GITHUB_ENV

      # Capture proxy URL before deletion (needed for Auth0 cleanup if applicable)
      - name: Get proxy URL
        run: |
          PROXY_URL=$(gcloud run services describe "${PROXY_SERVICE}" \
            --project="${PROJECT_ID}" --region="${REGION}" \
            --format="value(status.url)" 2>/dev/null || echo "")
          echo "PROXY_URL=${PROXY_URL}" >> $GITHUB_ENV

      # If your app uses Auth0 or another OAuth provider:
      # Remove the preview URL from the provider's allowed callback URLs here.
      # See Section 10 for the Auth0 Management API pattern.

      # Drop the PR database. Skip if your app has no per-PR database.
      - name: Drop PR database
        continue-on-error: true
        run: |
          gcloud sql databases delete "${DB_NAME}" \
            --instance="${CLOUD_SQL_INSTANCE_SHORT}" \
            --project="${PROJECT_ID}" \
            --quiet

      # Delete all Cloud Run services. continue-on-error ensures one failure
      # does not prevent the remaining services from being deleted.
      - name: Delete API service
        continue-on-error: true
        run: |
          gcloud run services delete "${API_SERVICE}" \
            --project="${PROJECT_ID}" \
            --region="${REGION}" \
            --quiet

      - name: Delete Frontend service
        continue-on-error: true
        run: |
          gcloud run services delete "${FRONTEND_SERVICE}" \
            --project="${PROJECT_ID}" \
            --region="${REGION}" \
            --quiet

      - name: Delete Proxy service
        continue-on-error: true
        run: |
          gcloud run services delete "${PROXY_SERVICE}" \
            --project="${PROJECT_ID}" \
            --region="${REGION}" \
            --quiet

      - name: Post cleanup comment
        uses: actions/github-script@v7
        with:
          script: |
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body: '## PR Preview Cleaned Up\n\nAll preview resources for PR #' +
                    context.issue.number + ' have been deleted.',
            });

14.6 Phase 5: Secrets Setup

After Terraform apply and workflow files are in place, configure the required secrets.

Step 1: Generate and store SA keys

# From the infrahive root directory
# Adapt this for your SA emails and secret names
DEPLOYER_EMAIL="{APP_NAME}-pr-preview-deployer-sa@{PROJECT_ID}.iam.gserviceaccount.com"
RUNTIME_EMAIL="{APP_NAME}-pr-preview-runtime-sa@{PROJECT_ID}.iam.gserviceaccount.com"

# Create deployer key
gcloud iam service-accounts keys create /tmp/deployer-key.json \
  --iam-account="${DEPLOYER_EMAIL}" \
  --project="{PROJECT_ID}"

# Store in Secret Manager
gcloud secrets versions add "{APP_NAME}-pr-preview-deployer-sa-key" \
  --data-file=/tmp/deployer-key.json \
  --project="{PROJECT_ID}"

# Create runtime key (stored for reference)
gcloud iam service-accounts keys create /tmp/runtime-key.json \
  --iam-account="${RUNTIME_EMAIL}" \
  --project="{PROJECT_ID}"

gcloud secrets versions add "{APP_NAME}-pr-preview-runtime-sa-key" \
  --data-file=/tmp/runtime-key.json \
  --project="{PROJECT_ID}"

rm /tmp/deployer-key.json /tmp/runtime-key.json

Step 2: Add GitHub Actions secrets

# Requires gh CLI and repo admin access
# Set the deployer SA key as a GitHub Actions secret
gh secret set GCP_PR_PREVIEW_DEPLOYER_SA_KEY \
  --body="$(gcloud secrets versions access latest \
    --secret="{APP_NAME}-pr-preview-deployer-sa-key" \
    --project="{PROJECT_ID}")" \
  --repo={GITHUB_ORG}/{REPO_NAME}

# If using Auth0:
gh secret set AUTH0_MANAGEMENT_CLIENT_SECRET --repo={GITHUB_ORG}/{REPO_NAME}
gh secret set AUTH0_APP_CLIENT_ID --repo={GITHUB_ORG}/{REPO_NAME}

Secrets checklist:

| Secret | Location | Status |
| --- | --- | --- |
| `GCP_PR_PREVIEW_DEPLOYER_SA_KEY` | GitHub Actions | [ ] Set |
| `AUTH0_MANAGEMENT_CLIENT_SECRET` | GitHub Actions | [ ] Set (if using Auth0) |
| `AUTH0_APP_CLIENT_ID` | GitHub Actions | [ ] Set (if using Auth0) |
| `{APP_NAME}-pr-preview-deployer-sa-key` | GCP Secret Manager | [ ] Populated |
| `{APP_NAME}-pr-preview-runtime-sa-key` | GCP Secret Manager | [ ] Populated |

14.7 Phase 6: Verification

Step 1: Verify Terraform resources exist

# Confirm service accounts were created
gcloud iam service-accounts list \
  --filter="email:{APP_NAME}-pr-preview" \
  --project="{PROJECT_ID}"

# Confirm secrets exist in Secret Manager
gcloud secrets list \
  --filter="name:{APP_NAME}-pr-preview" \
  --project="{PROJECT_ID}"

Step 2: Verify cross-project IAM bindings

# Confirm Cloud Run Service Agent has image pull access
gcloud projects get-iam-policy "{IMAGE_PROJECT}" \
  --flatten="bindings[].members" \
  --filter="bindings.role:roles/artifactregistry.reader" \
  --format="table(bindings.members)"

Step 3: Open a test PR

  1. Create a branch with a small change (for example, a comment edit in any file)
  2. Open a PR against main
  3. Monitor the deploy-preview.yml workflow run in GitHub Actions
  4. Verify:
       • the Cloud Build check ({CHECK_NAME}) reports success on the head commit
       • the expected Cloud Run services and the PR database exist in {PROJECT_ID}
       • a PR comment is posted with the preview URL and QR code
       • the preview URL serves the application

Step 4: Verify cleanup

  1. Close (without merging) the test PR
  2. Monitor the cleanup-preview.yml workflow run
  3. Verify:
       • all Cloud Run services for the PR have been deleted
       • the PR database has been dropped from the Cloud SQL instance
       • a cleanup comment is posted on the PR

Step 5: Verify re-deploy on push

  1. Reopen the test PR (or create a new one)
  2. Push a new commit
  3. Verify:
       • the deploy workflow runs again and redeploys using images tagged with the new commit SHA
       • the preview URL (stable across pushes, since service names do not change) now serves the updated code

15. Appendix: Source File Reference

Infrahive Repository

| File | Repository | Purpose |
| --- | --- | --- |
| `infra/pd-infra/business_unit_1/non-production/pr_preview_deployer.tf` | infrahive | Terraform for deployer SA, runtime SA, IAM bindings, Secret Manager secret shells, and Cloud Run Service Agent cross-project bindings |
| `infra/pd-infra/business_unit_1/non-production/outputs.tf` | infrahive | Terraform outputs, including `pr_preview_deployer_sa_email`, `pr_preview_deployer_sa_key_secret_id`, and `pr_preview_runtime_sa_email` |
| `scripts/create-pr-preview-sa-keys.sh` | infrahive | Helper script that reads Terraform outputs, creates JSON SA keys with `gcloud iam service-accounts keys create`, and stores them in Secret Manager with `gcloud secrets versions add` |
| `pr_preview_environments.mmd` | infrahive | Original Mermaid flowchart of the PR preview lifecycle (architectural reference) |
| `docs/architecture/pr-preview-environments.md` | infrahive | This document |

Consumer Portal Repository (the_consumer_portal)

| File | Repository | Purpose |
| --- | --- | --- |
| `.github/workflows/deploy-preview.yml` | the_consumer_portal | GitHub Actions workflow that deploys PR preview environments on PR open/synchronize/reopen |
| `.github/workflows/cleanup-preview.yml` | the_consumer_portal | GitHub Actions workflow that tears down PR preview environments on PR close/merge |
| `build/cloudbuild.yml` | the_consumer_portal | Cloud Build configuration: Kaniko parallel image builds for API, frontend, and proxy router |
| `compose/api/Dockerfile` | the_consumer_portal | Dockerfile for the Spring Boot API service |
| `compose/frontend/Dockerfile` | the_consumer_portal | Dockerfile for the React/Vite frontend served by Caddy |
| `compose/proxy_router/Dockerfile` | the_consumer_portal | Dockerfile for the Caddy reverse proxy router |
| `compose/proxy_router/Caddyfile` | the_consumer_portal | Caddy configuration that routes /api/* to the API service and all other paths to the frontend service |
| `api/src/main/resources/application-staging.yml` | the_consumer_portal | Spring Boot staging profile configuration, including the CORS wildcard pattern for preview environment URLs |