CFO

The Continuous Fuzzing Orchestrator (CFO) is a general-purpose fuzz testing harness built into the infrahive build system. It manages the full lifecycle of fuzz tests: discovering tasks, running them concurrently under a time budget, collecting structured results, and persisting them to GCS with deduplication and merge strategies. CFO is not specific to any one domain — it is a framework for running any registered fuzzer against any set of targets and branches.

Two fuzzers are currently registered: CANARY, a trivial Python fuzzer that always fails and serves as a sanity check for the harness itself, and INFRA_DRIFT, a Go fuzzer that detects when deployed cloud resources have drifted from the Terraform code that defines them.

Architecture

graph TD
    CFO["CFO<br/>(Python orchestrator)"]
    FUZZ["fuzz.zig<br/>(Zig fuzz runner)"]
    CANARY["canary<br/>(Python)"]
    INFRA_DRIFT["infra_drift<br/>(Go)"]
    TFO["tfo plan | tfparse | jq"]
    CFO --> FUZZ
    FUZZ --> CANARY
    FUZZ --> INFRA_DRIFT
    INFRA_DRIFT --> TFO

| Component | Language | File | Role |
|---|---|---|---|
| CFO | Python | src/scripts/cfo.py | Orchestrator: task generation, concurrent execution, seed record management, GCS upload |
| fuzz.zig | Zig | src/fuzz.zig | Fuzz runner: discovers and invokes fuzzers from src/fuzz_tests/ |
| canary | Python | src/fuzz_tests/canary.py | Trivial fuzzer that always fails, validating the harness works |
| infra_drift | Go | src/fuzz_tests/infra_drift/ | Drift detector: runs terraform plan per module, reports change count |

How It Works

CFO orchestrates fuzz runs through a six-step lifecycle:

  1. Task generation — For each registered fuzzer, CFO generates all fuzzer+branch+target task combinations. For INFRA_DRIFT, it calls ./bin/fuzz infra_drift --discover to enumerate all ~96 Terraform modules across 5 GCP projects and 3 branches. For CANARY, a single task is produced.

  2. Working directory setup — CFO creates git worktrees for each required branch. Required binaries are built once and hard-linked into each worktree to avoid redundant compilation across branches.

  3. Concurrent execution — All tasks are submitted to a thread pool (4 workers in CI, CPU count locally). The total budget is 60 minutes with a per-task cap of 30 minutes. Each task runs in its own process group with SIGTERM/SIGKILL lifecycle management and watchdog timers to enforce the budget.

  4. Result collection — Each completed task produces a SeedRecord capturing the commit SHA, fuzzer type, pass/fail outcome, branch, target, duration, and the exact command that was run.

  5. Seed merge — CFO downloads the existing seed dataset from GCS (devhivedb/infrahive/fuzzing/data.json). New results are merged into the dataset using each fuzzer’s declared MergeStrategy. The dataset is kept to a sliding 32-commit window with deduplication applied across all records.

  6. Upload — The merged dataset is pushed back to GCS. Upload only occurs when the --upload flag is passed, allowing dry runs and local development without touching the shared state.
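
Step 3 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in cfo.py: the helper names `run_task`, `_terminate_group`, and `run_all` and the 5-second grace period are hypothetical. Only the thread pool, the 60-minute total budget, the 30-minute per-task cap, the per-task process group, and the SIGTERM-then-SIGKILL escalation come from the description above. The sketch is POSIX-only.

```python
import os
import signal
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def _terminate_group(proc: subprocess.Popen, grace_s: float) -> None:
    """Escalate from SIGTERM to SIGKILL on the task's whole process group."""
    for sig in (signal.SIGTERM, signal.SIGKILL):
        try:
            # With start_new_session=True the group id equals the child's pid.
            os.killpg(proc.pid, sig)
        except ProcessLookupError:
            break  # the group is already gone
        try:
            proc.wait(timeout=grace_s)
            return
        except subprocess.TimeoutExpired:
            continue  # still alive after the grace period: escalate
    proc.wait()  # reap the child so no zombie is left behind


def run_task(cmd: list[str], timeout_s: float, grace_s: float = 5.0) -> tuple[bool, float]:
    """Run one task in its own process group; return (ok, duration in seconds)."""
    start = time.monotonic()
    # A new session gives the child its own process group, so signals sent
    # to the group also reach any grandchildren it spawned.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        _terminate_group(proc, grace_s)
    return proc.returncode == 0, time.monotonic() - start


def run_all(
    cmds: list[list[str]],
    workers: int = 4,
    total_budget_s: float = 60 * 60,
    per_task_cap_s: float = 30 * 60,
) -> list[tuple[bool, float]]:
    """Run every command on a thread pool under a shared wall-clock budget."""
    deadline = time.monotonic() + total_budget_s

    def bounded(cmd: list[str]) -> tuple[bool, float]:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False, 0.0  # budget exhausted before this task could start
        return run_task(cmd, min(per_task_cap_s, remaining))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(bounded, cmds))
```

Capping each task at the smaller of the per-task limit and the remaining budget is one simple way to keep the overall run inside 60 minutes. The real watchdog timers may work differently.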

Seed Records

A SeedRecord captures a single fuzz run result:

| Field | Type | Description |
|---|---|---|
| commit | string | Git commit SHA at the time of the run |
| fuzzer | Fuzzer | Which fuzzer produced this record |
| ok | bool | Whether the task passed |
| count | int | Number of targets checked (e.g. modules for INFRA_DRIFT) |
| branch | string | Branch the task ran against |
| target | string | Specific target (e.g. module path) |
| duration | float | Wall-clock seconds the task took |
| command | string | Exact command that was executed |
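
The fields above map naturally onto a small immutable record type. The sketch below is an assumption about shape only: the numeric enum values, the `to_json`/`from_json` helpers, and the choice to store the fuzzer by name are hypothetical, and the actual layout of data.json is not described in this document.

```python
import json
from dataclasses import asdict, dataclass
from enum import IntEnum


class Fuzzer(IntEnum):
    # Numeric values are placeholders; the source does not state them.
    CANARY = 1
    INFRA_DRIFT = 2


@dataclass(frozen=True)
class SeedRecord:
    commit: str      # Git commit SHA at the time of the run
    fuzzer: Fuzzer   # which fuzzer produced this record
    ok: bool         # whether the task passed
    count: int       # number of targets checked
    branch: str      # branch the task ran against
    target: str      # specific target, e.g. a module path
    duration: float  # wall-clock seconds
    command: str     # exact command that was executed

    def to_json(self) -> str:
        data = asdict(self)
        data["fuzzer"] = self.fuzzer.name  # assumed encoding: enum name
        return json.dumps(data, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "SeedRecord":
        data = json.loads(text)
        data["fuzzer"] = Fuzzer[data["fuzzer"]]
        return cls(**data)
```

Freezing the dataclass makes records hashable and comparable by value, which is convenient for the deduplication step.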

Each Fuzzer variant in the Fuzzer IntEnum declares its merge strategy:

| Strategy | Behavior |
|---|---|
| SUPERSEDE | A newer passing result replaces an older failing result for the same target. Used by INFRA_DRIFT: a module that now plans cleanly supersedes a prior drift detection. |
| ACCUMULATE | Failures pile up rather than being replaced. Used by CANARY: every failure is preserved as evidence the harness ran. |
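
To make the difference between the two strategies concrete, here is a minimal sketch. It uses a simplified, hypothetical `Record` type, and it assumes that "same target" means the same branch and target pair and that exact duplicate records are dropped. The source does not spell out either detail, so treat both as illustrative.

```python
from enum import Enum
from typing import NamedTuple


class MergeStrategy(Enum):
    SUPERSEDE = "supersede"
    ACCUMULATE = "accumulate"


class Record(NamedTuple):
    # Simplified stand-in for SeedRecord, for illustration only.
    commit: str
    branch: str
    target: str
    ok: bool


def merge(existing: list[Record], new: list[Record], strategy: MergeStrategy) -> list[Record]:
    """Merge new results into the existing dataset under one strategy."""
    merged = list(existing)
    for rec in new:
        if strategy is MergeStrategy.SUPERSEDE and rec.ok:
            # A fresh pass drops older failures for the same branch and target.
            merged = [
                old for old in merged
                if old.ok or (old.branch, old.target) != (rec.branch, rec.target)
            ]
        if rec not in merged:  # deduplicate exact repeats
            merged.append(rec)
    return merged
```

Under SUPERSEDE, a failing record for a module disappears once a later run of that module passes. Under ACCUMULATE, the same inputs leave every failure in place.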

The seed dataset is stored at devhivedb/infrahive/fuzzing/data.json in GCS and maintains a sliding window of the 32 most recent commits. Records outside the window are pruned on each upload.
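
The pruning step can be pictured as a filter over the record list. In this sketch the function name, the dict-shaped records, and the idea that the caller supplies the commit history newest-first are all assumptions; how CFO actually determines commit order is not stated here.

```python
def prune_to_window(
    records: list[dict],
    commits_newest_first: list[str],
    window: int = 32,
) -> list[dict]:
    """Keep only records whose commit is among the `window` most recent commits."""
    keep = set(commits_newest_first[:window])
    return [rec for rec in records if rec["commit"] in keep]
```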

CI Integration

CircleCI is the primary execution environment. CFO runs on an hourly schedule, Monday through Friday, 7 AM–5 PM CST, against the plan branch. The job passes --concurrency 4 --upload so results are persisted to GCS after each run. The cfo-runner service account provides both GCS write access for seed upload and Terraform impersonation for INFRA_DRIFT plan operations.

GitHub Actions runs the CFO test suite on Linux as part of the CI tests workflow. It does not execute fuzz tasks — only --typecheck (pyright) and --test (pytest) are invoked. This ensures the orchestrator’s own logic is validated on every pull request without requiring GCS access or Terraform credentials.

CLI Reference

./zig/zig build scripts -- cfo [flags]

| Flag | Description |
|---|---|
| --dry-run | Print all generated tasks without executing any of them |
| --concurrency N | Maximum concurrent tasks (default: CPU count) |
| --save-logs | Save per-task stdout/stderr to cfo_logs/ |
| --upload | Upload merged seed records to GCS after execution |
| --test | Run the embedded pytest suite (~40 tests) |
| --typecheck | Run pyright type checker against cfo.py |
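
For readers who want to see how such a flag set looks in code, the following sketch declares the same flags with argparse. It is not taken from cfo.py: whether CFO uses argparse at all is an assumption, and `build_parser` is a hypothetical name. Only the flag names, their meanings, and the CPU-count default come from the table above.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="cfo")
    parser.add_argument("--dry-run", action="store_true",
                        help="print all generated tasks without executing them")
    parser.add_argument("--concurrency", type=int, metavar="N",
                        default=os.cpu_count() or 1,
                        help="maximum concurrent tasks (default: CPU count)")
    parser.add_argument("--save-logs", action="store_true",
                        help="save per-task stdout/stderr to cfo_logs/")
    parser.add_argument("--upload", action="store_true",
                        help="upload merged seed records to GCS after execution")
    parser.add_argument("--test", action="store_true",
                        help="run the embedded pytest suite")
    parser.add_argument("--typecheck", action="store_true",
                        help="run pyright against cfo.py")
    return parser
```

Parsing `["--concurrency", "4", "--upload"]` with this parser mirrors the flags the CircleCI job passes.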

Adding a New Fuzzer

Register a new variant in the Fuzzer IntEnum inside src/scripts/cfo.py. Each variant declares its properties inline:

  • branches — which git branches to run against
  • needs_working_dir — whether a git worktree must be set up
  • needs_seed — whether prior seed records are loaded before execution
  • is_critical — whether a failure blocks the overall run result
  • merge_strategy — SUPERSEDE or ACCUMULATE

After adding the variant, pyright’s exhaustive match checking will flag every match fuzzer: block in the codebase that does not handle the new case. Fix each flagged location to integrate the fuzzer into task generation, execution, and result reporting. The fuzz test implementation itself goes in src/fuzz_tests/ and is discovered automatically by fuzz.zig.
