American Fuzzy Lop —
A Technical Deep Dive
From first principles to adversarial critique. Written for a practitioner who can handle depth, not someone who wants a Wikipedia summary.
- 01 — Mental Model — From First Principles
- 02 — Algorithmic Deep Dive
- 03 — Systems Engineering Decisions
- 04 — Adversarial Critique — Where AFL Breaks
- 05 — Build Intuition — Concrete Examples
- 06 — 7-Day Hands-On Curriculum
- 07 — Code Reading Map
- 08 — Modern Relevance — OSS-Fuzz, Sanitizers, CI
- 09 — Destroy Your Misconceptions
- 10 — 30-Day Practice Roadmap
Mental Model — From First Principles
What Problem Did Fuzzing Solve, and Why Was AFL a Step-Change?
Start here: why do bugs exist? Because software has an implicit contract between what the developer imagined the input space to be, and what it actually is. Bugs live in the gap between the developer's mental model and physical reality — the off-by-one in buffer length, the type confusion when parsing a malformed tag, the integer overflow that only appears with a specific combination of field values.
Before AFL, fuzzers attacked this problem in two flavors. Blind (dumb) fuzzers threw random mutations at a binary and watched it crash. No feedback. No learning. Discovering a deep code path was largely a lottery. Generation-based fuzzers required a grammar or specification — they understood the format and generated structurally valid inputs with deliberate corruption. Powerful in theory. Expensive in practice. Every new target format required new spec work.
AFL changed the economics. It combined the ease of a mutation fuzzer with a feedback signal that previously required manual specification: code coverage. If a mutated input causes the program to exercise a new branch — take a path it hadn't taken before — AFL keeps that input and uses it as a seed for further mutation. If it doesn't add coverage, discard it. This is the core idea. Everything else is engineering around that insight.
AFL operationalized the intuition that "interesting inputs make programs do new things" into a fast, automatable, seed-agnostic feedback loop — without requiring domain knowledge about the target format.
Coverage-Guided Greybox Fuzzing — The Right Mental Frame
The "greybox" label is precise. Whitebox fuzzing (symbolic execution, concolic testing) reasons about program semantics — it knows what the code does. Blackbox fuzzing (pure random, generation-based) ignores internals entirely. Greybox sits deliberately in the middle: AFL injects lightweight instrumentation to observe what happened (which edges were taken) without reasoning about why (what the values mean). This asymmetry is intentional. Full semantic reasoning is expensive and fragile. Coverage signals are cheap and composable.
Edge Coverage — Why Edges, Not Blocks?
Basic block coverage tells you which blocks executed. Edge coverage (also called branch coverage) tells you which transitions between blocks occurred. The difference is critical.
Consider: block A executes, then block B executes. With block coverage, this looks identical whether A→B happened or A→C→B happened. Edge coverage sees them as different traces. AFL records each edge as a hash of the two block IDs — cur_location XOR (prev_location >> 1) — indexing into a shared bitmap. This tuple representation captures local control flow transitions efficiently without comparing full execution traces.
; At the top of each basic block:
cur_location = (compile-time random value);
shared_mem[cur_location ^ prev_location]++;
prev_location = cur_location >> 1; ; shift prevents A→B == B→A
The right-shift ensures that the A→B edge gets a different bitmap slot than the B→A edge. Without it, loops would be invisible. The XOR combines two block IDs into one bitmap index. This is O(1) per basic block — essentially free at runtime.
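The indexing scheme is small enough to simulate directly. A minimal Python sketch (the block IDs are made-up stand-ins for the compile-time random values) shows that the shift keeps A→B and B→A in different slots:

```python
MAP_SIZE = 65536  # AFL's default bitmap size (64KB)

def edge_index(prev_id: int, cur_id: int) -> int:
    # prev_id was right-shifted when stored, so opposite directions differ
    return ((prev_id >> 1) ^ cur_id) % MAP_SIZE

A, B = 0x1234, 0x5678              # hypothetical block IDs
assert edge_index(A, B) != edge_index(B, A)
assert (A ^ B) == (B ^ A)          # without the shift, the two directions collide
```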
AFL also tracks hit counts in buckets: [1], [2], [3], [4–7], [8–15], [16–31], [32–127], [128+]. A loop going from 47 to 48 iterations maps to the same bucket — irrelevant. A loop going from 3 to 4 iterations changes bucket — potentially interesting. This coarse quantization prevents trivial variation from flooding the queue while still capturing meaningful behavioral change.
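A sketch of the bucket lookup in Python (AFL implements this as a precomputed 256-entry table, but the effect is identical; boundaries as given in the original whitepaper):

```python
# hit-count bucket boundaries (lo, hi), from AFL's whitepaper
BUCKETS = [(1, 1), (2, 2), (3, 3), (4, 7), (8, 15), (16, 31), (32, 127), (128, 255)]

def bucket(hits: int) -> int:
    """Map a raw edge hit count to its coarse bucket index."""
    hits = min(hits, 255)              # per-edge counters saturate at one byte
    for i, (lo, hi) in enumerate(BUCKETS):
        if lo <= hits <= hi:
            return i
    return -1                          # 0 hits: edge not taken

assert bucket(47) == bucket(48)        # same bucket: ignored
assert bucket(3) != bucket(4)          # bucket change: potentially interesting
```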
Instrumentation — Two Modes
Compile-time instrumentation (afl-gcc / afl-clang): The compiler injects the coverage stub into every basic block during compilation. Near-zero runtime cost. This is the default and the right choice when you have source.
QEMU mode: For binaries without source. QEMU runs the binary in user-space emulation and instruments basic blocks dynamically. Cost: approximately 2–5× slower. Still useful for closed-source targets.
The Forkserver — Why It Matters
Every test case execution needs a clean process state. Naive approach: fork() + execve() for each input. This is expensive. Program initialization — dynamic linker, constructors, early main() work — runs every single time.
AFL's forkserver is a clean design: the instrumented binary runs once, past all initialization, then pauses and waits for AFL to signal it. When AFL needs to test an input, the forkserver fork()s — it does NOT execve() again. The child process inherits already-initialized state, runs the test, exits. The parent forkserver loops back and waits.
For a typical target, this gives 1000+ executions/second instead of the 50–100 you'd get with naive exec-per-test. That is the difference between finding something in an hour or a week.
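The idea fits in a short Python sketch (POSIX only). Here target is a hypothetical stand-in for the instrumented program, and a real forkserver waits for commands on a pipe rather than iterating over a list — but the fork-without-exec structure is the same:

```python
import os

def target(data: bytes) -> int:
    """Hypothetical stand-in for the instrumented program under test."""
    if data.startswith(b"BUG"):
        raise RuntimeError("simulated crash")
    return 0

def forkserver(inputs):
    """Initialize once, then fork a fresh, pre-initialized child per test case."""
    _expensive_init = "config parsed, libraries loaded"  # paid once, inherited
    results = []
    for data in inputs:                  # real AFL signals over a pipe instead
        pid = os.fork()
        if pid == 0:                     # child: run one test, report via exit
            try:
                target(data)
                os._exit(0)
            except Exception:
                os._exit(1)              # nonzero exit stands in for a crash
        _, status = os.waitpid(pid, 0)   # parent: reap the child, record outcome
        results.append(os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1)
    return results
```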
Corpus Minimization
afl-cmin reduces a corpus to the smallest set of inputs that collectively maintain the same coverage as the full corpus. It's not the same as minimizing individual files (that's afl-tmin). The goal: remove redundant seeds that cover the same edges as other, smaller/faster seeds. Smaller corpus → faster queue cycles → more mutation attempts per unit time.
Deterministic Mutation Stages
AFL's mutation pipeline is not random all the way through. It begins with a set of deterministic stages — exhaustive, reproducible, not random. For a given seed, these run to completion and are then marked done forever. They include:
- Bit flips at widths 1/1, 2/1, 4/1, 8/8, 16/8, 32/8 — flip N bits, step by M
- Arithmetic — add/subtract small integers (±35) to 8-, 16-, 32-bit integers at each offset, trying both endiannesses
- Interesting values — inject boundary values like 0, 1, -1, INT_MAX, INT_MIN, 0x80, 0xFF, etc. at each offset
- Dictionary injection — user-provided or auto-extracted tokens inserted/overwritten at every position
These stages are expensive in absolute terms but highly structured — they systematically cover the "near-the-seed" mutation space before the randomness starts.
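Two of these stages are compact enough to write out exactly. Below is a Python sketch of the walking 1-bit flip and the 8-bit interesting-value overwrite (the value list is AFL's INTERESTING_8 set from config.h):

```python
def bitflip_1_1(data: bytes):
    """AFL's bitflip 1/1 stage: flip each bit once, MSB-first within each byte."""
    for bit in range(len(data) * 8):
        mutant = bytearray(data)
        mutant[bit // 8] ^= 128 >> (bit % 8)
        yield bytes(mutant)

INTERESTING_8 = [-128, -1, 0, 1, 16, 32, 64, 100, 127]  # AFL's 8-bit set

def interesting_8(data: bytes):
    """Overwrite each byte with each 'interesting' 8-bit boundary value."""
    for pos in range(len(data)):
        for val in INTERESTING_8:
            mutant = bytearray(data)
            mutant[pos] = val & 0xFF
            yield bytes(mutant)
```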
Havoc Stage
After deterministic stages, AFL enters havoc: stacked random mutations. A random number of operators (a power of two between 2 and 128 in the original implementation) is applied in sequence to the same input. Operators include: bit flip at random position, byte set to random value, block deletion, block insertion, block cloning, splicing in bytes from another seed. This is the "creative destruction" phase — it reaches combinations no deterministic stage could enumerate.
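A toy version of the stacking loop, with only three operators (real AFL draws from over a dozen, including block insert/clone/overwrite and dictionary operations):

```python
import random

def havoc(data: bytes, rng: random.Random) -> bytes:
    """One havoc round: a stack of 2..128 (power of two) random mutations."""
    mutant = bytearray(data)
    for _ in range(1 << rng.randint(1, 7)):     # 2, 4, ..., 128 stacked ops
        op = rng.randrange(3)
        pos = rng.randrange(len(mutant))
        if op == 0:                             # flip one random bit
            mutant[pos] ^= 1 << rng.randrange(8)
        elif op == 1:                           # set a random byte
            mutant[pos] = rng.randrange(256)
        elif len(mutant) > 1:                   # delete a byte
            del mutant[pos]
    return bytes(mutant)

out = havoc(b"hello world", random.Random(0))
```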
Splice Stage
Splice is genetic crossover. AFL picks another seed from the queue, finds a position where the two inputs diverge significantly, splices the tail of one onto the head of the other, then runs havoc on the result. This enables structural recombination — producing chimeras that inherit structure from two different interesting inputs. In practice, splice often unlocks bugs that require state from multiple distinct parser paths simultaneously.
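In code, splice is a simplified cousin of afl-fuzz's locate_diffs() logic: find the region where the two parents diverge and cut somewhere inside it. A minimal sketch:

```python
import random

def splice(a: bytes, b: bytes) -> bytes:
    """Head of a + tail of b, split inside their diverging region."""
    n = min(len(a), len(b))
    diffs = [i for i in range(n) if a[i] != b[i]]
    if not diffs or diffs[-1] - diffs[0] < 2:
        return a                                # too similar to splice usefully
    split = random.randrange(diffs[0] + 1, diffs[-1])
    return a[:split] + b[split:]
```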
Crash Triage
AFL separates crashes from hangs. Both go into output directories. Crashes are not automatically deduplicated by stack trace — AFL uses the coverage tuple to distinguish them. Two crashes that trigger different bitmap entries are saved separately. Two crashes that trigger identical tuples: only one is kept. This is fast but imprecise — it will both group distinct bugs and separate variants of the same bug depending on how control flow diverges.
Real triage requires: afl-tmin (minimize the crashing input), then reproduce under AddressSanitizer / GDB / lldb for root cause. The bitmap-based deduplication is a heuristic starting point, not a final answer.
Algorithmic Deep Dive
The AFL Execution Loop — Step by Step
- Load seed corpus from the input directory. Calibrate each seed: measure execution time and coverage tuple. Mark as favorite if it uniquely covers edges not covered by smaller/faster seeds.
- Select next queue entry. Round-robin with skip probabilities: non-favorite, already-fuzzed entries are skipped ~95% of the time. New favored entries get full attention. This biases work toward the frontier.
- Calibrate if needed. Verify the entry is not flaky (variable coverage across runs). Unstable entries are flagged; their coverage bits are cleared to avoid false positives in the bitmap.
- Trim the test case with an inline pass that mirrors afl-tmin's approach: repeatedly remove chunks, check that coverage is preserved, and keep the smaller version if so. Smaller inputs run faster and mutate more efficiently.
- Run deterministic stages (if not already done for this entry). Each mutation is tested against the target. New coverage → save the mutated input as a new queue entry.
- Compute performance score (perf_score). Factors: how many new tuples did this entry bring? Execution speed? File size? Entries that brought many cheap new edges score higher and get more havoc iterations.
- Run havoc for a number of iterations proportional to perf_score. Each havoc round applies a stack of random operators (a power of two between 2 and 128). After each stacked round, run the target. New coverage → enqueue.
- Optionally splice with a random queue member, then repeat havoc on the spliced result.
- Advance to next queue entry. When the full queue is exhausted, increment the cycle counter, cull the queue (deprioritize entries whose coverage is now subsumed by later finds), loop.
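Step 4 (trimming) can be sketched as follows. Here coverage_of stands in for "run the target and hash its coverage bitmap"; real AFL steps chunk sizes from roughly 1/16th down to 1/1024th of the file rather than halving, but the keep-only-if-coverage-unchanged check is the same idea:

```python
def trim(data: bytes, coverage_of) -> bytes:
    """Remove chunks while the coverage fingerprint stays unchanged."""
    baseline = coverage_of(data)
    chunk = len(data) // 2
    while chunk >= 1:
        pos = 0
        while pos < len(data):
            candidate = data[:pos] + data[pos + chunk:]
            if coverage_of(candidate) == baseline:
                data = candidate       # chunk was dead weight: drop it
            else:
                pos += chunk           # chunk matters: keep it, move on
        chunk //= 2
    return data

# toy target whose 'coverage' depends only on seeing the token KEY:
assert trim(b"xxKEYxx", lambda d: b"KEY" in d) == b"KEY"
```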
Pseudocode
function afl_main_loop(initial_seeds, target):
    queue = calibrate_and_load(initial_seeds)
    while True:
        entry = pick_next(queue)              # round-robin, skip heuristics
        if not entry.calibrated:
            calibrate(entry, target)          # measure speed, verify stability
        entry.input = trim(entry, target)     # minimize while preserving coverage

        if not entry.det_done:
            for stage in [bitflips, arith, interesting_values, dictionary]:
                for mutant in stage.generate(entry.input):
                    result = run(target, mutant)
                    if has_new_coverage(result, global_bitmap):
                        enqueue(queue, mutant)
            entry.det_done = True

        score = calculate_score(entry)        # new edges, speed, size
        havoc_rounds = base_rounds * score
        for _ in range(havoc_rounds):
            n_mutations = geometric_sample(max=128)
            mutant = entry.input
            for _ in range(n_mutations):
                mutant = apply_random_operator(mutant, queue)
            result = run(target, mutant)
            if has_new_coverage(result, global_bitmap):
                enqueue(queue, mutant)

        if use_splicing:
            for _ in range(SPLICE_CYCLES):    # SPLICE_CYCLES = 15
                other = random_choice(queue)
                spliced = splice(entry.input, other.input)
                # run havoc on spliced...

        if queue.cycle_complete():
            cull_queue(queue)
            global_bitmap.trim()
Seed Scheduling Heuristics
AFL's calculate_score() is worth understanding precisely. It produces a multiplier on the base havoc iteration count, starting at 100%, then adjusting:
- Execution speed bonus: if this entry runs faster than average, score goes up
- Bitmap contribution: if this entry was the first to trigger many unique edges, score goes up
- File size penalty: larger inputs run slower and mutate less effectively; score goes down
- Age penalty: entries that have been in the queue for many cycles without producing new finds get a slight penalty
The net effect: AFL naturally biases toward small, fast, high-coverage seeds. This is a greedy local heuristic, not a globally optimal schedule. AFLFast later modeled this as a Markov chain and showed that AFL over-invests in high-frequency paths at the expense of rare paths. AFL++ addresses this with alternative power schedules (explore, fast, coe, rare, mmopt).
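The shape of the heuristic can be sketched with illustrative constants (these are not AFL's actual multipliers, which live in calculate_score() and were tuned empirically):

```python
def perf_score(exec_us, avg_exec_us, bitmap_size, avg_bitmap_size,
               cycles_without_finds):
    """Multiplier on havoc iterations; constants here are illustrative only."""
    score = 100                               # baseline: 100%
    if exec_us < avg_exec_us * 0.5:
        score *= 3                            # fast entries earn more havoc time
    elif exec_us > avg_exec_us * 2:
        score //= 3                           # slow entries are throttled
    if bitmap_size > avg_bitmap_size * 1.5:
        score *= 2                            # broad-coverage entries earn more
    score //= 1 + cycles_without_finds // 4   # stale entries decay
    return score
```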
Path Discovery Economics
Think of AFL as doing a biased random walk on the program's path space. Each time it discovers a new edge, it plants a flag there and starts exploring nearby. The problem is that the space is enormous — the number of distinct paths can grow exponentially with the number of branches — but AFL sidesteps this by working at the edge level, not the path level.
The tuple representation is deliberately lossy. Two inputs that exercise the same edges in different global orders are considered equivalent. This is the engineering tradeoff: path explosion — the combinatorial nightmare that kills symbolic execution — is avoided by design.
Why Certain Mutations Are Surprisingly Effective
Bit flips at small widths are not as random as they look. Many format parsers gate on specific flag bits, version bytes, or type tags. Flipping a single bit at the right position can flip parser behavior from "discard" to "process deeply."
Interesting values (0, -1, INT_MAX, 128, 255) are disproportionately effective because programmers write conditionals like if (len > MAX) or if (offset == 0). These boundary values are exactly where off-by-one errors, integer overflows, and null pointer dereferences live.
Stacked havoc works because real vulnerabilities often require multiple conditions to hold simultaneously. A single mutation rarely achieves this. Stacked mutations approximate the combinatorial space without enumerating it.
Splice is underrated. Consider fuzzing a parser that handles two distinct record types. Neither seed alone reaches the code that handles type-2 records embedded inside type-1 structures. Splice can create a chimera that does.
Systems Engineering Decisions
The Bitmap — 64KB, Shared Memory, Not Bigger
AFL's trace bitmap is 64KB — sized to fit comfortably in L2 cache on the hardware of AFL's era. This is not an accident. If the bitmap fits in L2 cache, the XOR-and-increment instrumentation stub hits cached memory rather than main RAM, an order-of-magnitude-plus difference in access latency. At thousands of executions per second, bitmap cache misses would otherwise eat a large share of total runtime.
The bitmap is a shared memory segment between the fuzzer process and the child process (via shmget/shmat). The child writes to it during execution; the fuzzer reads it after the child exits. No IPC overhead, no copying.
With 65,536 slots and potentially millions of edges in a large program, hash collisions are inevitable. Two distinct edges can map to the same bitmap slot — a false aliasing that makes the fuzzer think it has covered an edge it hasn't. AFL accepts this as a known tradeoff. AFL++ addresses it with PCGuard instrumentation, which assigns unique, collision-free IDs to each edge.
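The fuzzer-side check against the global map can be sketched like this. Real AFL's has_new_bits() distinguishes "new tuple" from "new hit count" in its return value; this Python version collapses both into a boolean:

```python
MAP_SIZE = 65536

def has_new_bits(trace: bytearray, virgin: bytearray) -> bool:
    """Return True if this run set any bitmap bits still marked 'virgin'."""
    found = False
    for i, t in enumerate(trace):
        if t and (t & virgin[i]):
            virgin[i] &= ~t & 0xFF   # claim the bits so they count only once
            found = True
    return found

virgin = bytearray([0xFF] * MAP_SIZE)   # all bits unseen at startup
```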
Queue Management
The queue is a flat file-backed linked list, not an in-memory priority queue. Queue entries are files in out/queue/ with names encoding metadata: id:000123,src:000042,op:havoc,rep:16. This naming is deliberate — it makes the queue's history human-readable and resumable across crashes or restarts.
"Culling" the queue: AFL periodically marks entries as "redundant" if their coverage is now fully subsumed by other entries. Redundant entries are not deleted — they stay on disk — but they're deprioritized in future cycles.
Calibration
Before fuzzing a seed, AFL runs it multiple times (default: 8 runs) and checks whether coverage is stable. If the bitmap changes across identical runs of the same input, the target has non-deterministic behavior — threading, ASLR-dependent code paths, time-dependent behavior. Unstable entries get their "variable" bits cleared from the global bitmap, preventing them from polluting coverage accounting.
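The stability check amounts to diffing trace maps across repeated runs of one input; any bit that ever differs is "variable". A minimal sketch:

```python
def variable_bits(traces) -> bytearray:
    """OR together the XOR-diffs of repeated runs: set bits are unstable."""
    base = traces[0]
    var = bytearray(len(base))
    for t in traces[1:]:
        for i in range(len(base)):
            var[i] |= base[i] ^ t[i]
    return var

# run 0 and run 1 disagree only in byte 1:
assert variable_bits([bytes([1, 2, 4]), bytes([1, 3, 4])]) == bytearray([0, 1, 0])
```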
Timeout Handling
AFL sets a timeout at roughly 5× the calibrated execution time, with a 20ms floor. Inputs that exceed it are classified as hangs, not crashes. They're saved separately and counted — an infinite loop in a parser is itself a bug (DoS). The aggressive timeout is intentional: "tarpits" that improve coverage by 1% while running 100× slower would otherwise cripple throughput.
Parallelization
AFL's parallelization model is embarrassingly simple: run multiple independent instances against the same target, sharing a queue via the filesystem. One instance is the "master" (-M), the rest are "secondary" (-S). Secondary instances periodically sync interesting finds from other instances into their own queue. No shared state beyond the filesystem.
Memory and CPU Considerations
Original AFL set a default memory limit of 50 MB on child processes via setrlimit, treating memory exhaustion as a crash — useful for catching allocator-based vulnerabilities. AFL++ removed the default limit (MEM_LIMIT=0) because ASAN's shadow memory causes false positives under a hard cap. When running without ASAN, supply -m 50 explicitly to restore the guard.
CPU frequency scaling is an enemy. Modern CPUs throttle clock speed for power efficiency. AFL checks for CPU frequency scaling at startup and warns loudly — inconsistent clock speed means inconsistent timeout calculations. Disable it: echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor.
Adversarial Critique — Where AFL Breaks
Magic Bytes and Checksums
Many file formats begin with a magic signature: \x89PNG\r\n\x1a\n, PK\x03\x04, \x7fELF. The parser checks this first and rejects input if it doesn't match. AFL's random mutations will destroy magic bytes with very high probability. The result: almost every mutated input is rejected at the first line of the parser.
Checksums are worse. A CRC or MD5 embedded in the input means any mutation that doesn't also update the checksum produces an input that the parser rejects before doing anything interesting.
Mitigations: Supply a dict file with magic bytes. Patch the target to disable checksum validation. Use AFL++ with laf-intel (which splits multi-byte comparisons into chains of single-byte comparisons, so coverage feedback can solve them byte by byte) or CMPLOG (which logs comparison operands at runtime and substitutes them into the input, REDQUEEN-style).
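The dictionary mitigation is mechanically simple; the overwrite variant just stamps each token at every feasible offset. A sketch:

```python
def dictionary_mutants(data: bytes, tokens):
    """Yield one mutant per (token, position): token overwrites bytes in place."""
    for tok in tokens:
        for pos in range(len(data) - len(tok) + 1):
            yield data[:pos] + tok + data[pos + len(tok):]

# a PNG magic token now survives mutation intact at some offset:
mutants = list(dictionary_mutants(bytes(8), [b"\x89PNG"]))
```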
Deep State Machines
Network protocol parsers are state machines. A valid TLS handshake requires: ClientHello → ServerHello → Certificate → ServerHelloDone → ClientKeyExchange → Finished — in order. A mutation of ClientHello that slightly malforms it causes the server to reject it before reaching any of the deeper states.
AFL has no concept of protocol state. AFLNet extends AFL specifically to handle this: it seeds with recorded packet captures, identifies server response codes to infer protocol state, and steers fuzzing toward states with low prior exploration.
Structured Inputs
A JSON parser has a grammar. An SQL parser has a grammar. AFL does not know the grammar. Its bit-level mutations will produce syntactically invalid inputs that are rejected before any interesting parsing begins. You can spend 48 hours fuzzing a JSON parser with AFL and never discover a bug that only manifests on semantically valid but structurally unusual input.
Why Hybrid Fuzzing Emerged
Coverage-guided fuzzing gets stuck at "hard branches" — comparisons that require a specific value to pass. if (magic == 0xDEADBEEF) — AFL's mutations have a 1-in-4-billion chance of randomly generating the right value.
Hybrid fuzzing (Driller, SymCC, QSYM) uses AFL for fast broad coverage and falls back to concolic execution only when AFL gets stuck. AFL is great at finding the first 80% of paths cheaply; concolic is the scalpel for the hard 20%.
AFL vs libFuzzer vs AFL++ vs honggfuzz
| Fuzzer | Architecture | Strengths | Weaknesses | Best For |
|---|---|---|---|---|
| AFL (original) | Forkserver, file I/O | Proven, well-understood, any target | Bitmap collisions, limited customization | General use, legacy |
| libFuzzer | In-process, LLVM coverage | No fork overhead, sanitizer-native | In-process crashes destabilize fuzzer | Library fuzzing with harness |
| AFL++ | Forkserver + LLVM PCGuard | Collision-free, CMPLOG, custom mutators | More complex setup | Most production fuzzing today |
| honggfuzz | Multi-mode (file/net/perf) | Hardware performance counters, persistent mode | Less ecosystem support | Performance-sensitive targets |
If you're starting new fuzzing work today, use AFL++. It incorporates a decade of research improvements while remaining operationally familiar. CMPLOG handles many magic-byte and checksum problems automatically. It's the right default.
Build Intuition — Concrete Examples
Fuzzing a CLI Parser (e.g., objdump, readelf)
These are ideal AFL targets. They take a file as input, parse it deterministically, and crash or behave incorrectly on malformed input. No network state, no magic byte problem (ELF/PE headers are short and can be in the seed corpus). AFL will immediately start making progress. Within hours, you'll typically find reads past buffer bounds, integer overflows in size calculations, and null pointer dereferences in malformed section handling.
Fuzzing a Network Protocol Parser
Much harder. AFL as-is will struggle because: the target is a server (not a file-reading CLI), it has network state, and input requires a valid sequence of messages. The standard approach:
- Write a harness that reads "input" from a file, calls the protocol parsing function directly, and returns.
- Stub out the network layer — replace recv() and send() with functions that read from/write to your harness buffer.
- For stateful protocols: use AFLNet with packet capture seeds, or pre-initialize the parser state to a known mid-session state.
Fuzzing an Image Decoder (e.g., libpng, libjpeg)
Classic AFL target. Start with a small valid PNG (under 1KB). AFL will observe coverage, then start mutating. IHDR chunk width/height fields are integers — AFL's arithmetic mutations will try boundary values like 0, 1, UINT_MAX immediately. Corruption of IDAT chunk data exercises the decompressor.
Easy vs. Hard Bugs
| Bug Class | AFL Ease | Reason |
|---|---|---|
| Stack overflow (unchecked length) | Easy | Coverage changes immediately when length boundary is crossed |
| NULL deref from malformed field | Easy | One mutation, one crash, simple |
| Integer overflow in allocation size | Medium | Requires specific value; interesting-value stage helps |
| Use-After-Free requiring specific alloc/free order | Medium-Hard | Requires stacked mutations + ASAN to detect |
| Type confusion behind checksum | Hard | Checksum blocks all mutations from reaching the type field |
| Logic bug with no crash signal | Hard | AFL only sees crashes and hangs; silent incorrect behavior is invisible |
| Crypto implementation error | Very Hard | No crash, no coverage change; requires differential fuzzing |
Always compile your fuzz target with AddressSanitizer (-fsanitize=address). Without it, many memory corruption bugs produce no crash — they corrupt memory silently. ASAN makes heap overflows, stack overflows, UAFs, and use-after-return immediately crash with a precise diagnostic. Fuzzing without ASAN is leaving bugs invisible.
7-Day Hands-On Curriculum
Day 1
- Install AFL++: apt install afl++ afl++-clang, or build from source (preferred — read the Makefile)
- Compile a toy target: readelf or nm from binutils. CC=afl-clang-fast ./configure && make
- Run AFL for 30 minutes. Observe the UI. Learn every field: stability, map density, cycle progress, exec speed
- Read AFL's status screen documentation cover to cover while it runs
- Goal: understand what "map density 15%" means and why "stability 100%" matters

Day 2
- Write a deliberately vulnerable C parser: a 50-line function with at least one stack overflow, one integer overflow opportunity, and one NULL deref
- Compile with afl-clang-fast + ASAN: AFL_USE_ASAN=1 afl-clang-fast -g vuln_parser.c -o vuln_parser
- Examine the disassembly — find the instrumentation stubs AFL injected
- Create two seed inputs: one minimal valid input, one that exercises a second code path
- Run afl-fuzz. Time how long it takes to find the first crash

Day 3
- Use afl-showmap to dump the coverage bitmap for a single input
- Run afl-cov (or llvm-cov) to render HTML coverage over time
- Compare coverage after 5 min vs 30 min vs 2 hours. Where did the fuzzer get stuck?
- Add a "hard" branch: if (magic == 0xDEADBEEF). Observe time-to-pass. Then add the value to a dict file and observe the speedup
- Bonus: run afl-analyze to see AFL's guess at which bytes are structural vs. data

Day 4
- Once crashes appear in out/crashes/, reproduce each one manually
- Run under GDB or lldb. Get the exact fault address and stack trace
- Run under ASAN — compare the ASAN report to the GDB crash
- Use afl-tmin to minimize the crashing input: afl-tmin -i crash_input -o minimized -- ./vuln_parser @@
- Classify each crash: stack overflow / heap overflow / NULL deref / UAF / other
- Write a one-paragraph root cause analysis for each unique crash class

Day 5
- Run afl-cmin to find the minimal set: afl-cmin -i out/queue -o minimized_corpus -- ./vuln_parser @@
- Count entries before and after. The ratio should be 5:1 to 20:1
- Start a new AFL session seeded with the minimized corpus. Compare exec speed and coverage growth rate
- Read the afl-cmin source to understand how it uses afl-showmap internally

Day 6
- Pick a real library: libpng, zlib, or expat. Write a fuzzing harness that reads from a file and calls the parsing function directly
- Model: int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
- Think carefully: what state should be initialized before each call? What resources need cleanup?
- Run for 1 hour. Compare coverage with a naive "feed to CLI" approach
- Document: what design decisions did you make? What did you stub out?

Day 7
- Run AFL++ in parallel: one master, two secondaries
- Observe how secondary instances sync finds from the master. Watch queue growth across instances
- Run afl-whatsup out/ to see aggregate stats across all instances
- Write a post-mortem: what bugs did you find? What did AFL miss and why?
- Stretch goal: enable AFL++ CMPLOG (-c 0) and observe its effect on magic-byte-protected paths
Code Reading Map
For AFL++ (the active codebase). Read in this order:
- main() — in src/afl-fuzz.c: setup, argument parsing, forkserver launch, and the top-level loop over the queue.
- run_target() — observe how the timeout (ITIMER_REAL) is set, and how the trace bitmap is read after execution. This function runs millions of times — every cycle here matters.
- has_new_bits() — the coverage primitive: compare a run's trace bitmap against the global map and claim any newly set bits.
- fuzz_one_original() — the per-seed fuzzing routine: calibration, trimming, deterministic stages, havoc, splice.
- calculate_score() — the seed-scheduling heuristic discussed in the Algorithmic Deep Dive section.
- afl-compiler-rt.o.c — the runtime linked into instrumented targets. Find __afl_trace() — this is the XOR-and-increment stub. Also find the forkserver loop (__afl_forkserver()) — the code that runs inside the target process, waiting for AFL to signal it to fork.
- afl-llvm-pass.so.cc — the compiler pass that injects the coverage stub into each basic block (the compile-time instrumentation mode described earlier).
Don't start by reading linearly. Start with has_new_bits() because it's small (50 lines) and is the conceptual core. Then trace backwards: who calls it? That's common_fuzz_stuff() in run.c. Who calls that? fuzz_one_original(). Build your mental call graph bottom-up from the coverage primitive.
Modern Relevance
OSS-Fuzz — AFL at Scale
Google's OSS-Fuzz runs continuous fuzzing against 1000+ open source projects. It uses libFuzzer and AFL++ as primary engines, with ClusterFuzz as the orchestrator. The key insight of OSS-Fuzz: the bottleneck in fuzzing is not the fuzzer — it's the harness. OSS-Fuzz invested heavily in harness quality, then scaled horizontally.
As of 2025, OSS-Fuzz has found over 10,000 bugs in projects including OpenSSL, curl, FFmpeg, freetype, and dozens of critical infrastructure libraries.
Sanitizers — The Bug Amplifiers
Sanitizers transform silent corruption into loud crashes. They are not optional when fuzzing.
- AddressSanitizer (ASAN): heap overflow, stack overflow, UAF, double-free. ~2× runtime cost. Always use for fuzzing.
- UndefinedBehaviorSanitizer (UBSan): signed integer overflow, null pointer dereference, misaligned access. Minimal overhead.
- MemorySanitizer (MSan): use of uninitialized memory. Cannot combine with ASAN. ~3× overhead but catches info-leak primitives.
- ThreadSanitizer (TSan): data races. For concurrent targets only. Very expensive (~10×).
Fuzzing in CI Pipelines
- Continuous fuzzing: Run OSS-Fuzz or an equivalent continuously against main branch. Findings are filed as security bugs automatically.
- Regression fuzzing: Maintain a corpus of known-crash inputs in your repo. In CI, replay every crash input against every build.
- Coverage gating: Run a short (5-minute) fuzzing session per PR. If new code has zero fuzzing coverage, fail the CI check.
Fuzzing in Modern Secure SDLC
- Design phase: Identify parser surface area. Anything that reads untrusted input is a fuzzing target. Mark it in threat model.
- Implementation phase: Developers write initial harnesses alongside the parsing code.
- Test phase: Run fuzzing for 24–72 hours before each release.
- Response phase: Crash corpus is maintained and replayed in every CI build. CVE inputs are added to the permanent regression corpus.
Destroy Your Misconceptions
30-Day Practice Roadmap
This is not a reading list. It is an action plan. Each week has a concrete deliverable.
Days 1–7
Days 8–14
Days 15–21
Days 22–30
The roadmap above will teach you AFL. But the pattern you need to break is stopping at category-level understanding. "I know how AFL works" is not the same as "I have found 3 bugs in real code using AFL." The difference between those two statements is 30 days of the above work, done without shortcuts.
Do the lab. Write the harness. Find the crash. Write the root cause. That is the loop. Everything else is commentary.
Primary Sources & Further Reading
Primary Technical Sources
- Zalewski — technical_details.txt — The canonical whitepaper. Every serious AFL practitioner should read this in full. Covers the bitmap, instrumentation, mutation stages, and design rationale from the original author.
- AFL++ Source Repository — The active codebase. Pair with Section 7 (Code Reading Map) to navigate it.
- AFL++ — fuzzing_in_depth.md — The most complete operational guide for AFL++. Covers parallelization, persistent mode, CMPLOG, and corpus management in depth.
Research Papers
- Böhme et al. — Coverage-Based Greybox Fuzzing as Markov Chain (CCS 2016) — The AFLFast paper. Formally models AFL's seed scheduling as a Markov chain and demonstrates that AFL over-invests in high-frequency paths. Foundational for understanding why AFL++ introduced alternative power schedules.
- Lemieux & Sen — FairFuzz: A Targeted Mutation Strategy (ASE 2018) — Demonstrates that bias toward rare branches significantly improves coverage. Directly influenced AFL++'s mutation strategies.
- Aschermann et al. — REDQUEEN: Fuzzing with Input-to-State Correspondence (NDSS 2019) — The paper behind AFL++'s CMPLOG feature. Shows how to solve magic-byte and checksum barriers without concolic execution by tracking the correspondence between input bytes and comparison operands.
- Fioraldi et al. — AFL++: Combining Incremental Steps of Fuzzing Research (WOOT 2020) — The AFL++ design paper. Describes PCGuard instrumentation, CMPLOG integration, and the modular architecture.
- Böhme et al. — Boosting Fuzzer Efficiency: An Information Theoretic Perspective (TOSEM 2020) — Theoretical framing of coverage-guided fuzzing as information maximization. Useful if you want to understand why the greybox model works mathematically.
Video Lectures & Talks
- Yan Shoshitaishvili — Fuzzing & Software Security (ASU CS 4678) — University lecture series. Clear and detailed; covers AFL internals, harness writing, and triage from an academic perspective.
- AFL++ team — Modern Fuzzing of C/C++ Projects (Hack.lu 2020) — Practical walkthrough of AFL++ features from the maintainers: CMPLOG, persistent mode, custom mutators.
- Brandon Falk — Fuzzing for Correctness (Enigma 2021) — Covers going beyond crash-finding: differential fuzzing, correctness oracles, and fuzzing closed-source targets.
- Google Security — Introduction to OSS-Fuzz — How Google's continuous fuzzing infrastructure works, how to onboard a project, and the operational lessons from running at scale.
Practical References
- OSS-Fuzz Documentation — How to integrate your open-source project into Google's continuous fuzzing infrastructure. Includes harness examples for C/C++, Rust, Python, and Go.
- Google Fuzzing Tutorial — Step-by-step harness writing guide. Covers libFuzzer, but the harness patterns translate directly to AFL++.
- Chromium Fuzzing Guide — Production-grade documentation from the team that has likely found more real-world bugs with fuzzing than anyone else.
- lcamtuf's blog — Zalewski's posts on AFL findings and design decisions. Provides intuition that the technical documentation alone doesn't give.
- AFL++ LTO Instrumentation README — Deep dive into LTO mode and PCGuard. Read when you need to understand the collision-free instrumentation architecture.