CFO
The Continuous Fuzzing Orchestrator (CFO) is a general-purpose fuzz testing harness built into the infrahive build system. It manages the full lifecycle of fuzz tests: discovering tasks, running them concurrently under a time budget, collecting structured results, and persisting them to GCS with deduplication and merge strategies. CFO is not specific to any one domain — it is a framework for running any registered fuzzer against any set of targets and branches.
Two fuzzers are currently registered: CANARY, a trivial Python fuzzer that always fails and serves as a sanity check for the harness itself, and INFRA_DRIFT, a Go fuzzer that detects when deployed cloud resources have drifted from the Terraform code that defines them.
Architecture
graph TD
CFO["CFO (Python orchestrator)"]
FUZZ["fuzz.zig (Zig fuzz runner)"]
CANARY["canary (Python)"]
INFRA_DRIFT["infra_drift (Go)"]
TFO["tfo plan | tfparse | jq"]
CFO --> FUZZ
FUZZ --> CANARY
FUZZ --> INFRA_DRIFT
INFRA_DRIFT --> TFO
| Component | Language | File | Role |
|---|---|---|---|
| CFO | Python | src/scripts/cfo.py | Orchestrator: task generation, concurrent execution, seed record management, GCS upload |
| fuzz.zig | Zig | src/fuzz.zig | Fuzz runner: discovers and invokes fuzzers from src/fuzz_tests/ |
| canary | Python | src/fuzz_tests/canary.py | Trivial fuzzer that always fails, validating the harness works |
| infra_drift | Go | src/fuzz_tests/infra_drift/ | Drift detector: runs terraform plan per module, reports change count |
How It Works
CFO orchestrates fuzz runs through a six-step lifecycle:
1. Task generation: For each registered fuzzer, CFO generates all fuzzer+branch+target task combinations. For INFRA_DRIFT, it calls ./bin/fuzz infra_drift --discover to enumerate all ~96 Terraform modules across 5 GCP projects and 3 branches. For CANARY, a single task is produced.
2. Working directory setup: CFO creates git worktrees for each required branch. Required binaries are built once and hard-linked into each worktree to avoid redundant compilation across branches.
3. Concurrent execution: All tasks are submitted to a thread pool (4 workers in CI, CPU count locally). The total budget is 60 minutes with a per-task cap of 30 minutes. Each task runs in its own process group with SIGTERM/SIGKILL lifecycle management and watchdog timers to enforce the budget.
4. Result collection: Each completed task produces a SeedRecord capturing the commit SHA, fuzzer type, pass/fail outcome, branch, target, duration, and the exact command that was run.
5. Seed merge: CFO downloads the existing seed dataset from GCS (devhivedb/infrahive/fuzzing/data.json). New results are merged into the dataset using each fuzzer's declared MergeStrategy. The dataset is kept to a sliding 32-commit window with deduplication applied across all records.
6. Upload: The merged dataset is pushed back to GCS. Upload only occurs when the --upload flag is passed, allowing dry runs and local development without touching the shared state.
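The concurrent execution step can be sketched roughly as follows. This is a minimal illustration, not the code in cfo.py: the function names `run_with_budget` and `run_all`, the 5-second grace period, and the way the remaining budget is computed are all assumptions. It is POSIX-only, since it relies on process groups.

```python
import os
import signal
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_budget(cmd: list[str], timeout_s: float, grace_s: float = 5.0) -> tuple[bool, float]:
    """Run cmd in its own process group; return (ok, duration_seconds)."""
    start = time.monotonic()
    # start_new_session=True makes the child a session and process-group leader,
    # so a signal sent to the group also reaches any processes it spawned.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        try:
            os.killpg(proc.pid, signal.SIGTERM)  # ask the group to stop
            try:
                proc.wait(timeout=grace_s)
            except subprocess.TimeoutExpired:
                os.killpg(proc.pid, signal.SIGKILL)  # force-stop after the grace period
        except ProcessLookupError:
            pass  # the group exited on its own in the meantime
        proc.wait()
        return False, time.monotonic() - start
    return proc.returncode == 0, time.monotonic() - start


def run_all(cmds: list[list[str]], workers: int = 4,
            per_task_s: float = 30 * 60, total_s: float = 60 * 60) -> list[tuple[bool, float]]:
    """Run every command on a thread pool under a shared total budget."""
    deadline = time.monotonic() + total_s

    def run_one(cmd: list[str]) -> tuple[bool, float]:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False, 0.0  # total budget exhausted before this task started
        # A task never gets more than the per-task cap or what is left overall.
        return run_with_budget(cmd, min(per_task_s, remaining))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_one, cmds))
```

Threads are sufficient here because each worker spends its time blocked on a child process rather than executing Python code.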
Seed Records
A SeedRecord captures a single fuzz run result:
| Field | Type | Description |
|---|---|---|
| commit | string | Git commit SHA at the time of the run |
| fuzzer | Fuzzer | Which fuzzer produced this record |
| ok | bool | Whether the task passed |
| count | int | Number of targets checked (e.g. modules for INFRA_DRIFT) |
| branch | string | Branch the task ran against |
| target | string | Specific target (e.g. module path) |
| duration | float | Wall-clock seconds the task took |
| command | string | Exact command that was executed |
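As a rough illustration, the fields above map naturally onto a Python dataclass. The sketch below is an assumption about shape only: the numeric enum values, the choice of a frozen dataclass, the `to_json` helper, and serializing the fuzzer by name are not taken from cfo.py, and the sample values in the usage lines are made up.

```python
import json
from dataclasses import asdict, dataclass
from enum import IntEnum


class Fuzzer(IntEnum):
    # Placeholder numeric values; the real ones are not documented here.
    CANARY = 0
    INFRA_DRIFT = 1


@dataclass(frozen=True)
class SeedRecord:
    commit: str      # Git commit SHA at the time of the run
    fuzzer: Fuzzer   # which fuzzer produced this record
    ok: bool         # whether the task passed
    count: int       # number of targets checked
    branch: str      # branch the task ran against
    target: str      # specific target, e.g. a module path
    duration: float  # wall-clock seconds
    command: str     # exact command that was executed

    def to_json(self) -> dict:
        """Return a JSON-serializable dict (fuzzer stored by name, an assumption)."""
        data = asdict(self)
        data["fuzzer"] = self.fuzzer.name
        return data


# Hypothetical record; the command string is illustrative only.
record = SeedRecord(
    commit="abc123", fuzzer=Fuzzer.CANARY, ok=False, count=1,
    branch="plan", target="canary", duration=0.2, command="./bin/fuzz canary",
)
encoded = json.dumps(record.to_json())
```

Freezing the dataclass makes records hashable, which is convenient when deduplicating them in a set.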
Each Fuzzer variant in the Fuzzer IntEnum
declares its merge strategy:
| Strategy | Behavior |
|---|---|
| SUPERSEDE | A newer passing result replaces an older failing result for the same target. Used by INFRA_DRIFT: a module that now plans cleanly supersedes a prior drift detection. |
| ACCUMULATE | Failures pile up rather than being replaced. Used by CANARY: every failure is preserved as evidence the harness ran. |
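The difference between the two strategies can be sketched in a few lines. This is an illustration under stated assumptions, not the merge code in cfo.py: records are plain dicts, "same target" is taken to mean the same (branch, target) pair, and the `merge` function name is hypothetical. Deduplication is left out.

```python
from enum import Enum


class MergeStrategy(Enum):
    SUPERSEDE = "supersede"
    ACCUMULATE = "accumulate"


def merge(existing: list[dict], new: list[dict], strategy: MergeStrategy) -> list[dict]:
    """Merge new records into existing ones according to the strategy."""
    if strategy is MergeStrategy.ACCUMULATE:
        # Nothing is replaced: every record, pass or fail, is kept.
        return existing + new

    # SUPERSEDE: a new passing result drops older failing results
    # for the same (branch, target) pair (the key choice is an assumption).
    now_passing = {(r["branch"], r["target"]) for r in new if r["ok"]}
    kept = [
        r for r in existing
        if r["ok"] or (r["branch"], r["target"]) not in now_passing
    ]
    return kept + new


old = [
    {"commit": "c1", "branch": "plan", "target": "a", "ok": False},
    {"commit": "c1", "branch": "plan", "target": "b", "ok": False},
]
fresh = [{"commit": "c2", "branch": "plan", "target": "a", "ok": True}]

superseded = merge(old, fresh, MergeStrategy.SUPERSEDE)
accumulated = merge(old, fresh, MergeStrategy.ACCUMULATE)
```

With SUPERSEDE the old failure for target a disappears while the failure for b stays; with ACCUMULATE all three records survive.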
The seed dataset is stored at
devhivedb/infrahive/fuzzing/data.json in GCS and maintains
a sliding window of the 32 most recent commits. Records outside the
window are pruned on each upload.
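Pruning to the window might look like the sketch below. It assumes the caller supplies commit SHAs ordered newest first (for example from `git log --format=%H`), that records are plain dicts, and that two records are duplicates only when every field matches; none of these details are confirmed by the source, and `prune_to_window` is a hypothetical name.

```python
import json


def prune_to_window(records: list[dict], recent_commits: list[str], window: int = 32) -> list[dict]:
    """Keep records from the `window` newest commits, dropping exact duplicates."""
    keep = set(recent_commits[:window])
    seen: set[str] = set()
    pruned = []
    for record in records:
        if record["commit"] not in keep:
            continue  # outside the sliding window
        key = json.dumps(record, sort_keys=True)  # canonical form for comparison
        if key in seen:
            continue  # exact duplicate of a record already kept
        seen.add(key)
        pruned.append(record)
    return pruned


commits = [f"c{i}" for i in range(40)]  # c0 is the newest commit
records = [
    {"commit": "c0", "target": "a", "ok": True},
    {"commit": "c31", "target": "a", "ok": False},
    {"commit": "c32", "target": "a", "ok": False},  # 33rd newest: pruned
    {"commit": "c0", "target": "a", "ok": True},    # duplicate: dropped
]
windowed = prune_to_window(records, commits)
```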
CI Integration
CircleCI is the primary execution environment. CFO
runs on an hourly schedule, Monday through Friday, 7 AM–5 PM CST,
against the plan branch. The job passes
--concurrency 4 --upload so results are persisted to GCS
after each run. The cfo-runner service account provides
both GCS write access for seed upload and Terraform impersonation for
INFRA_DRIFT plan operations.
GitHub Actions runs the CFO test suite on Linux as
part of the CI tests workflow. It does not execute fuzz tasks — only
--typecheck (pyright) and --test (pytest) are
invoked. This ensures the orchestrator’s own logic is validated on every
pull request without requiring GCS access or Terraform credentials.
CLI Reference
./zig/zig build scripts -- cfo [flags]
| Flag | Description |
|---|---|
| --dry-run | Print all generated tasks without executing any of them |
| --concurrency N | Maximum concurrent tasks (default: CPU count) |
| --save-logs | Save per-task stdout/stderr to cfo_logs/ |
| --upload | Upload merged seed records to GCS after execution |
| --test | Run the embedded pytest suite (~40 tests) |
| --typecheck | Run pyright type checker against cfo.py |
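For readers who want to see how these flags fit together, here is one way they could be declared with argparse. Whether cfo.py actually uses argparse is an assumption, and `build_parser` is a hypothetical name; only the flag names, the `N` placeholder, and the CPU-count default come from the table above.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="cfo")
    parser.add_argument("--dry-run", action="store_true",
                        help="print all generated tasks without executing them")
    parser.add_argument("--concurrency", type=int, metavar="N",
                        default=os.cpu_count() or 1,
                        help="maximum concurrent tasks (default: CPU count)")
    parser.add_argument("--save-logs", action="store_true",
                        help="save per-task stdout/stderr to cfo_logs/")
    parser.add_argument("--upload", action="store_true",
                        help="upload merged seed records to GCS after execution")
    parser.add_argument("--test", action="store_true",
                        help="run the embedded pytest suite")
    parser.add_argument("--typecheck", action="store_true",
                        help="run pyright against cfo.py")
    return parser


# The flag combination the CircleCI job passes.
ci_args = build_parser().parse_args(["--concurrency", "4", "--upload"])
```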
Adding a New Fuzzer
Register a new variant in the Fuzzer IntEnum inside
src/scripts/cfo.py. Each variant declares its properties
inline:
- branches: which git branches to run against
- needs_working_dir: whether a git worktree must be set up
- needs_seed: whether prior seed records are loaded before execution
- is_critical: whether a failure blocks the overall run result
- merge_strategy: SUPERSEDE or ACCUMULATE
After adding the variant, pyright’s exhaustive match checking will
flag every match fuzzer: block in the codebase that does
not handle the new case. Fix each flagged location to integrate the
fuzzer into task generation, execution, and result reporting. The fuzz
test implementation itself goes in src/fuzz_tests/ and is
discovered automatically by fuzz.zig.