Infrastructure Drift
The infrastructure drift fuzzer detects when deployed cloud resources
have diverged from the Terraform code that defines them. It runs
terraform plan against every discovered module and reports
any with a non-zero change count as drift. CFO invokes it on a schedule
across all environment branches.
Running
Discover all modules:
./zig/zig build fuzz -- infra_drift --discover
Run a single module:
./zig/zig build fuzz -- infra_drift --infra=hb-infra --branch=production --module=vault
Dry run (print the plan command without executing):
./zig/zig build fuzz -- infra_drift --infra=hb-infra --branch=production --module=vault --dry-run
How It Works
CFO invokes the infra_drift binary in two stages. First,
discovery mode enumerates all modules across all tracked branches and
emits them as a JSON array. Second, CFO fans out: it invokes
infra_drift once per module in single-module mode and runs the
plans concurrently across workers. Each single-module invocation runs
the full detection pipeline and exits with 0 (no drift) or 1 (drift found
or execution error).
The development branch is excluded. Only production,
shared, and non-production are checked.
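The fan-out stage can be pictured as a small worker pool. The following is a minimal sketch, not CFO's actual scheduler: the `Module` and `Result` types, the `fanOut` function, the worker count, and the `network` module name are all hypothetical, and the `run` callback stands in for invoking infra_drift in single-module mode and returning its exit code.

```go
package main

import (
	"fmt"
	"sync"
)

// Module identifies one Terraform module. Field names are illustrative.
type Module struct {
	Project string
	Branch  string
	Name    string
}

// Result pairs a module with the exit code of its single-module run
// (0 = no drift, 1 = drift found or execution error).
type Result struct {
	Module   Module
	ExitCode int
}

// fanOut calls run once per module using a fixed number of workers and
// returns the results in the same order as the input slice.
func fanOut(modules []Module, workers int, run func(Module) int) []Result {
	if workers < 1 {
		workers = 1
	}
	results := make([]Result, len(modules))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is written by exactly one worker, so no lock is needed.
				results[i] = Result{Module: modules[i], ExitCode: run(modules[i])}
			}
		}()
	}
	for i := range modules {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	modules := []Module{
		{"hb-infra", "production", "vault"},
		{"ha-infra", "shared", "network"}, // hypothetical module name
	}
	// Stand-in for a real single-module invocation.
	fake := func(m Module) int {
		if m.Name == "vault" {
			return 1
		}
		return 0
	}
	for _, r := range fanOut(modules, 4, fake) {
		fmt.Printf("%s/%s/%s exit=%d\n", r.Module.Project, r.Module.Branch, r.Module.Name, r.ExitCode)
	}
}
```

Writing each result to its own slice index keeps the output order stable regardless of which worker finishes first, which makes a scheduled report easier to compare between runs.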
Module Discovery
Discovery scans infra/ for directories with a
-infra suffix, then walks each project directory looking
for backend.tf files. Every directory containing
backend.tf is a Terraform module. The path structure
encodes three pieces of information:
infra/<project>/<business_unit>/<branch>/<module>/backend.tf
Discovery extracts the project, branch, and module name from the path, filters to the target branch, and returns the full list.
Across all five infrastructure projects and three environment branches, discovery yields approximately 96 modules.
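The path parsing described above could look roughly like the sketch below. It is an illustration rather than the contents of discovery.go: the `Module` type, `parseModulePath`, and `discover` are hypothetical names, the `core` business unit is made up, and the code assumes slash-separated paths relative to the repository root.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Module holds the three values discovery extracts from a module path.
type Module struct {
	Project string
	Branch  string
	Name    string
}

// parseModulePath extracts project, branch, and module from a path shaped like
// infra/<project>/<business_unit>/<branch>/<module>/backend.tf.
// It reports false for any path that does not match that shape.
func parseModulePath(p string) (Module, bool) {
	parts := strings.Split(path.Clean(p), "/")
	if len(parts) != 6 || parts[0] != "infra" || parts[5] != "backend.tf" {
		return Module{}, false
	}
	if !strings.HasSuffix(parts[1], "-infra") {
		return Module{}, false
	}
	// parts[2] is the business unit, which discovery does not report.
	return Module{Project: parts[1], Branch: parts[3], Name: parts[4]}, true
}

// discover keeps only the modules that belong to the target branch.
func discover(paths []string, branch string) []Module {
	var out []Module
	for _, p := range paths {
		if m, ok := parseModulePath(p); ok && m.Branch == branch {
			out = append(out, m)
		}
	}
	return out
}

func main() {
	paths := []string{
		"infra/hb-infra/core/production/vault/backend.tf",
		"infra/hb-infra/core/development/vault/backend.tf",
		"infra/hb-infra/core/production/vault/main.tf",
	}
	// The development branch is never passed as a target, so it is skipped.
	for _, branch := range []string{"production", "shared", "non-production"} {
		for _, m := range discover(paths, branch) {
			fmt.Printf("%+v\n", m)
		}
	}
}
```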
Infrastructure Projects
| Project | Description |
|---|---|
| bees-infra | Bees infrastructure |
| common-infra | Shared/common infrastructure |
| ha-infra | HA (Azure-based) infrastructure |
| hb-infra | HB (GCP-based) infrastructure |
| pd-infra | PD infrastructure |
The Detection Pipeline
Each module runs through a three-stage pipeline:
flowchart LR
A["tfo plan\n<project> <branch> <module>"] --> B["tfparse"]
B --> C["jq '.[0].change_count'"]
C --> D{"> 0?"}
D -->|yes| E["DRIFT"]
D -->|no| F["OK"]
tfo plan runs terraform plan through TFO
with service account impersonation. tfparse parses the plan
output into structured JSON. jq extracts the
change_count field. Any value greater than zero means the
deployed infrastructure no longer matches the code.
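The drift decision itself is simple. As a sketch of the same check in Go, assuming only what the jq filter implies (tfparse emits a JSON array whose first element carries a numeric `change_count` field), the hypothetical `hasDrift` function below decodes that output and applies the greater-than-zero rule. Any other fields in the tfparse output are ignored here because their shape is not documented.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// planSummary mirrors only the field the pipeline reads; other fields
// in the tfparse output are ignored by the decoder.
type planSummary struct {
	ChangeCount int `json:"change_count"`
}

// hasDrift reports whether the first plan summary has a change count above zero.
// It is the Go equivalent of: jq '.[0].change_count' followed by a "> 0" check.
func hasDrift(tfparseOutput []byte) (bool, error) {
	var summaries []planSummary
	if err := json.Unmarshal(tfparseOutput, &summaries); err != nil {
		return false, fmt.Errorf("decode tfparse output: %w", err)
	}
	if len(summaries) == 0 {
		return false, errors.New("tfparse output is an empty array")
	}
	return summaries[0].ChangeCount > 0, nil
}

func main() {
	for _, sample := range []string{`[{"change_count": 2}]`, `[{"change_count": 0}]`} {
		drift, err := hasDrift([]byte(sample))
		fmt.Println(sample, "drift:", drift, "err:", err)
	}
}
```

Treating an empty array as an error rather than "no drift" is a deliberate choice in this sketch: a module whose plan could not be summarized should not be reported as clean.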
The pipeline runs as a bash one-liner with pipefail set
so any stage failure propagates as an error:
tfo plan <project> <branch> <module> | tfparse | jq '.[0].change_count'
Transient API failures (gateway timeouts, TLS errors, provider download failures) are retried up to three times with exponential backoff before the module is recorded as an error.
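The retry behaviour can be sketched as follows. This is not the code in runner.go: the `retry` and `isTransient` functions, the marker strings used to classify an error as transient, the example errors, and the base delay are all assumptions made for illustration. The sketch also reads "up to three times" as three retries after the first attempt, which the text does not state explicitly.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// transientMarkers are hypothetical substrings; the real classifier may differ.
var transientMarkers = []string{"gateway timeout", "tls handshake", "failed to install provider"}

// Example errors used by main; both messages are made up.
var (
	errGateway = errors.New("504 Gateway Timeout") // matches a transient marker
	errAuth    = errors.New("invalid credentials") // matches none, so it is not retried
)

// isTransient reports whether the error message contains a transient marker.
func isTransient(err error) bool {
	msg := strings.ToLower(err.Error())
	for _, m := range transientMarkers {
		if strings.Contains(msg, m) {
			return true
		}
	}
	return false
}

// recordingSleeper collects delays instead of sleeping, so the backoff
// schedule can be inspected without waiting.
type recordingSleeper struct{ Delays []time.Duration }

func (r *recordingSleeper) Sleep(d time.Duration) { r.Delays = append(r.Delays, d) }

// retry calls fn until it succeeds, fails with a non-transient error, or
// maxRetries retries have been used. The delay doubles after every retry.
func retry(maxRetries int, base time.Duration, sleep func(time.Duration), fn func() error) error {
	delay := base
	for attempt := 0; ; attempt++ {
		err := fn()
		if err == nil || !isTransient(err) || attempt == maxRetries {
			return err
		}
		sleep(delay)
		delay *= 2
	}
}

func main() {
	rec := &recordingSleeper{}
	calls := 0
	err := retry(3, time.Second, rec.Sleep, func() error {
		calls++
		if calls < 3 {
			return errGateway // transient on the first two attempts
		}
		return nil
	})
	fmt.Println("calls:", calls, "delays:", rec.Delays, "err:", err)
	fmt.Println("auth error retried:", isTransient(errAuth))
}
```

Passing the sleep function as a parameter keeps the backoff logic testable; production code would pass `time.Sleep` instead of the recorder.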
Source
src/fuzz_tests/infra_drift/main.go: entry point, CLI flags, discovery mode
src/fuzz_tests/infra_drift/runner.go: single-plan execution and retry logic
src/fuzz_tests/infra_drift/discovery.go: module discovery and path parsing