
Secret Management

The vault system is the mechanism by which application secrets reach running services across all THB infrastructure projects. It addresses a specific security threat: if a single GCP project is compromised, an attacker should not gain access to production secrets. The solution is defense in depth through project-level isolation — credentials that unlock secrets live in a separate GCP project from the secrets themselves. Compromising the application project gives an attacker neither the key to the vault nor the vault contents.

The core insight is a two-project separation. One GCP project holds service account credentials (vault-keys); another holds application secrets (vault-secrets). An app must traverse both projects to obtain its secrets, and neither project alone is sufficient. Two architectural variants implement this pattern: GCP-native (bees-infra, hb-infra, pd-infra), where each VM has its own GCP identity via the metadata server, and hybrid Azure/GCP (ha-infra), where Azure VMs carry a shared provisioner service account key staged at VM boot.

The Two-Project Vault Pattern

Each application gets a dedicated service account and two secrets. The service account lives in vault-keys alongside its private key JSON. The application’s actual secrets live in vault-secrets. An app authenticates as its vault SA to reach its secrets; the provisioner identity can fetch the key but never the secrets it unlocks.

```mermaid
graph LR
    subgraph vault-keys ["prj-bu1-{env}-vault-keys-{hash}"]
        SA["{app}-vault-sa<br/>Service Account"]
        KEY["{app}-vault-key<br/>SA private key JSON"]
    end
    subgraph vault-secrets ["prj-bu1-{env}-vault-secrets-{hash}"]
        SECRET["{app}-vault<br/>App secrets JSON"]
    end
    PROV["Provisioner Identity"] -->|secretAccessor| KEY
    KEY -->|activate-service-account| SA
    SA -->|secretAccessor| SECRET
```

The following resources are created per application by the vault module:

| # | Resource | Project | Name Pattern | Notes |
|---|----------|---------|--------------|-------|
| 1 | Service Account | vault-keys | {app}-vault-sa | Identity the app assumes at runtime |
| 2 | Secret (key container) | vault-keys | {app}-vault-key | Stores the SA's private key JSON |
| 3 | IAM binding | vault-keys | secretmanager.secretAccessor on key secret | Grants provisioner identity read access to the key |
| 4 | Secret (app secrets) | vault-secrets | {app}-vault | Stores the application's secret values as JSON |
| 5 | IAM binding | vault-secrets | secretmanager.secretAccessor on app secret | Grants {app}-vault-sa read access to app secrets |

The vault projects follow the standard GCP project naming convention:

| Environment | vault-keys project | vault-secrets project |
|-------------|--------------------|-----------------------|
| dev | prj-bu1-d-vault-keys-{hash} | prj-bu1-d-vault-secrets-{hash} |
| non-prod | prj-bu1-n-vault-keys-{hash} | prj-bu1-n-vault-secrets-{hash} |
| prod | prj-bu1-p-vault-keys-{hash} | prj-bu1-p-vault-secrets-{hash} |
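The project naming scheme can be sketched as a small helper. This is a hypothetical illustration, not code from the repo; the {hash} suffix is the per-project random suffix and is supplied by the caller:

```python
def vault_project_ids(env: str, hash_suffix: str) -> tuple[str, str]:
    """Build the (vault-keys, vault-secrets) project IDs for an environment.

    env is the single-letter environment code: d (dev), n (non-prod), p (prod).
    """
    if env not in ("d", "n", "p"):
        raise ValueError(f"unknown env code: {env}")
    base = f"prj-bu1-{env}"
    return (f"{base}-vault-keys-{hash_suffix}",
            f"{base}-vault-secrets-{hash_suffix}")
```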

Authentication Chain

All variants share the same structure: a provisioner identity fetches the SA key from vault-keys, activates as the vault SA, then fetches app secrets from vault-secrets. The variants differ only in where the provisioner identity comes from.

GCP-Native

Used by bees-infra, hb-infra, and pd-infra. The provisioner identity is the VM’s own GCP instance service account, obtained automatically from the GCP metadata server. No credential file is staged on the VM — GCP handles the identity.

```mermaid
sequenceDiagram
    participant VM as VM Instance SA<br/>(GCP metadata server)
    participant VK as vault-keys project
    participant VS as vault-secrets project
    participant APP as Application

    VM->>VK: fetch {app}-vault-key (secretAccessor)
    VK-->>VM: SA private key JSON
    VM->>VM: gcloud auth activate-service-account
    VM->>VS: fetch {app}-vault (as {app}-vault-sa)
    VS-->>APP: App secrets JSON
    APP->>APP: Run with secrets in memory
```
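The three-step chain can be sketched as the gcloud invocations a VM would run. This is an illustrative helper, not code from the repo; project IDs and the key path are placeholder values:

```python
def vault_auth_commands(app: str, keys_project: str, secrets_project: str,
                        key_path: str = "/tmp/vault-sa-key.json") -> list[list[str]]:
    """Return the gcloud command sequence for the vault authentication chain."""
    return [
        # 1. As the provisioner identity, read the SA private key from vault-keys.
        ["gcloud", "secrets", "versions", "access", "latest",
         f"--secret={app}-vault-key", f"--project={keys_project}",
         f"--out-file={key_path}"],
        # 2. Activate as the per-app vault service account.
        ["gcloud", "auth", "activate-service-account", f"--key-file={key_path}"],
        # 3. As the vault SA, read the application secrets from vault-secrets.
        ["gcloud", "secrets", "versions", "access", "latest",
         f"--secret={app}-vault", f"--project={secrets_project}"],
    ]
```

Each inner list could be passed to subprocess.run; the sketch returns them so the sequence is visible without touching GCP.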

The gcp_secret_vault module from the shared terraform-modules repo creates all five resources listed above, with the VM’s instance SA as the provisioner identity.

Hybrid Azure/GCP

Used by ha-infra only. Azure VMs have no GCP metadata server, so they cannot obtain a GCP identity automatically. Instead, cloud-init stages a shared provisioner service account key at /etc/gcp/service-account.json on each VM at boot, and the deploy user activates it for gcloud use.

```mermaid
sequenceDiagram
    participant INIT as cloud-init
    participant VM as Azure VM
    participant VK as vault-keys project
    participant VS as vault-secrets project
    participant APP as Application

    INIT->>VM: write /etc/gcp/service-account.json
    VM->>VM: gcloud auth activate-service-account<br/>(provisioner SA from /etc/gcp/service-account.json)
    VM->>VK: fetch {app}-vault-key (secretAccessor)
    VK-->>VM: App SA private key JSON
    VM->>VM: gcloud auth activate-service-account<br/>(as {app}-vault-sa)
    VM->>VS: fetch {app}-vault
    VS-->>APP: App secrets JSON
    APP->>APP: Run with secrets in memory
```

Because all Azure VMs share the same provisioner SA, the provisioner is granted secretAccessor on every app’s vault-key secret in the vault-keys project. It has no access to vault-secrets — only the per-app vault SA can read app secrets.

ha-infra defines its vault resources inline rather than through the gcp_secret_vault module because the hybrid pattern requires a shared provisioner SA that the GCP-native module does not support.

Naming Conventions

Secret and service account names follow a consistent {app}-vault-* pattern:

| Resource | Pattern | Example |
|----------|---------|---------|
| Service Account | {app}-vault-sa | ha-humana-vault-sa |
| Key Secret | {app}-vault-key | ha-humana-vault-key |
| App Secret | {app}-vault | ha-humana-vault |
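The pattern is mechanical enough to express as a one-liner. A hypothetical helper, not code from the repo:

```python
def vault_resource_names(app: str) -> dict[str, str]:
    """Derive the three per-app vault resource names from a normalized app name."""
    return {
        "service_account": f"{app}-vault-sa",   # identity the app assumes
        "key_secret": f"{app}-vault-key",       # holds the SA private key JSON
        "app_secret": f"{app}-vault",           # holds the application secrets
    }
```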

The {app} value in these names is a normalized form of the application name. The source of that normalization depends on the infrastructure project:

| Project | Transformation |
|---------|----------------|
| bees-infra | None — app name used as-is |
| pd-infra | None — app name used as-is |
| hb-infra | benefits-platform → bh, homealign → ha |
| ha-infra | 6-step normalization (see below) |

ha-infra Name Normalization

GCP service account IDs have a 30-character maximum. The -vault-sa suffix is 9 characters, leaving 21 characters for the app name portion. ha-infra app names come from Windows naming conventions and often exceed this limit, so vault.tf applies a deterministic normalization:

  1. Basic normalization — lowercase, replace invalid characters (anything not [a-z0-9-]) with hyphens, abbreviate ha_admin_ prefix to ha-, collapse repeated hyphens, trim leading/trailing hyphens
  2. Truncate — clip to 21 characters
  3. Group by normalized value — detect which app names collide after truncation
  4. Disambiguate collisions — for colliding names, truncate to 17 characters and append - + first 3 characters of the MD5 hash of the original app name
  5. Validate no remaining collisions — Terraform precondition blocks enforce this; the apply fails if duplicates remain
  6. Validate GCP requirements — all IDs must start with a letter and be 1–21 characters

Example: ha_admin_humana → step 1 normalizes to ha-humana (9 chars, under limit, no collision) → SA ID: ha-humana-vault-sa
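The six steps can be sketched in Python. This is an assumed reconstruction from the documented steps, not the actual HCL in vault.tf; in particular, the prefix abbreviation is applied before invalid-character replacement, and the disambiguation hash is assumed to be taken from the original (pre-normalization) name:

```python
import hashlib
import re


def normalize_app_names(names: list[str]) -> dict[str, str]:
    """Map each original app name to its normalized, collision-free form."""

    def basic(name: str) -> str:
        # Step 1: lowercase, abbreviate the ha_admin_ prefix, replace invalid
        # characters with hyphens, collapse repeats, trim the ends.
        n = name.lower()
        if n.startswith("ha_admin_"):
            n = "ha-" + n[len("ha_admin_"):]
        n = re.sub(r"[^a-z0-9-]", "-", n)
        n = re.sub(r"-{2,}", "-", n)
        return n.strip("-")

    # Step 2: truncate to the 21 characters left after the -vault-sa suffix.
    truncated = {name: basic(name)[:21] for name in names}

    # Step 3: group originals by truncated form to detect collisions.
    groups: dict[str, list[str]] = {}
    for original, short in truncated.items():
        groups.setdefault(short, []).append(original)

    result: dict[str, str] = {}
    for short, originals in groups.items():
        if len(originals) == 1:
            result[originals[0]] = short
        else:
            # Step 4: for colliding names, truncate to 17 chars and append
            # "-" plus the first 3 hex chars of the MD5 of the original name.
            for original in originals:
                digest = hashlib.md5(original.encode()).hexdigest()[:3]
                result[original] = truncated[original][:17] + "-" + digest

    # Steps 5-6: mirror the Terraform preconditions.
    values = list(result.values())
    assert len(values) == len(set(values)), "collisions remain after disambiguation"
    assert all(re.fullmatch(r"[a-z][a-z0-9-]{0,20}", v) for v in values)
    return result
```

For example, normalize_app_names(["ha_admin_humana"]) yields {"ha_admin_humana": "ha-humana"}, matching the worked example above.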

Provisioning a New App

Adding a new app to the vault requires three steps:

  1. Add the app to tfvars — set the app’s name and VM assignment in the project’s .auto.tfvars
  2. Run terraform apply — creates the vault SA, key secret container, app secret container, and IAM bindings
  3. Run vault_post_apply — creates the SA key and stores it in the key secret

Step 3 is separate from Terraform because creating a service account key produces a private key JSON. If Terraform managed this, the private key would be stored in Terraform state — a significant security exposure. vault_post_apply creates the key outside Terraform and stores it directly in Secret Manager.

After vault_post_apply completes, populate the app secret with actual secret values in the vault-secrets project.

vault_post_apply

vault_post_apply is a Python script that bridges the gap between terraform apply and a running application. It performs five steps:

  1. Verify vault resources exist — checks the SA, key secret, app secret, and IAM bindings are all present; fails with a clear error if terraform apply hasn’t been run
  2. Check for a valid existing key — reads the latest version of {app}-vault-key, validates it is well-formed JSON with private_key and client_email fields, and confirms the key ID still exists on the SA; if valid, skips key creation
  3. Create a new SA key — runs gcloud iam service-accounts keys create to produce a new private key JSON
  4. Store the key in the secret — runs gcloud secrets versions add {app}-vault-key to add the key as a new secret version
  5. Test end-to-end access — activates as the vault SA using the stored key, attempts to read {app}-vault from vault-secrets, and retries with a delay to handle GCP key propagation (a new key can take several seconds to become usable)

CLI:

./zig/zig build scripts -- vault_post_apply --infra <project> --app <name> --env <d|n|p>

Vault Across Projects

| Project | Uses Module | Name Transform | Provisioner Identity | Auth Model |
|---------|-------------|----------------|----------------------|------------|
| bees-infra | Yes (gcp_secret_vault) | None | Per-VM instance SA | GCP-native |
| hb-infra | Yes (gcp_secret_vault) | benefits-platform → bh, homealign → ha | Per-VM instance SA | GCP-native |
| pd-infra | Yes (gcp_secret_vault) | None | Per-VM instance SA | GCP-native |
| ha-infra | No (inline) | 6-step normalization | Shared provisioner SA | Hybrid Azure/GCP |
| common-infra | No vault infrastructure | | | |

ha-infra defines vault resources inline because the hybrid Azure/GCP pattern requires a shared provisioner SA that the gcp_secret_vault module does not support. common-infra uses standalone GCP Secret Manager secrets without the two-project separation.

Source

| File | Purpose |
|------|---------|
| infra/ha-infra/business_unit_1/production/vault.tf | Inline vault resources with 6-step name normalization |
| infra/hb-infra/business_unit_1/production/vault.tf | Module-based vault with benefits-platform/homealign abbreviations |
| infra/pd-infra/business_unit_1/production/vault.tf | Module-based vault, root VMs only |
| infra/bees-infra/business_unit_1/production/vault.tf | Module-based vault, simplest configuration |
| infra/ha-infra/modules/azure_linux_vm/scripts/cloud-config.yaml | cloud-init that stages provisioner SA key at /etc/gcp/service-account.json |
| src/scripts/vault_post_apply/main.py | Post-apply script for SA key creation and end-to-end verification |