LibAFL | Baby's first coverage-guided fuzzer

2024-12-30

I recently got the chance to give a talk at work titled "Fuzzing with LibAFL and finding 0days fast (hopefully)". My goal in this post is to guide you through building your first fuzzer using LibAFL in a whitebox setting. We will be fuzzing an example C program I wrote, statically linking the function of interest into our fuzzer.

Explaining the build system

For context, I created a cargo project with cargo new, and the src directory contains the following files: lib.rs, Makefile, program.c and fuzzer.
lib.rs holds our LibAFL fuzzer code.
Makefile is our build system; it uses clang to statically link the fuzzer and the vulnerable program together.
program.c is, you guessed it… our vulnerable program!
fuzzer is the resulting executable, statically linked from our fuzzer code and the vulnerable program code, so that we can focus on fuzzing a specific function inside the vulnerable program.

Coverage Guided Fuzzing

Coverage-guided fuzzing tries to maximize code coverage by tracking which code paths each input triggers. Inputs that reach previously unseen coverage are kept and mutated further, which lets the fuzzer work its way deeper into the code. The coverage information typically comes from compile-time instrumentation, such as -fsanitize-coverage=trace-pc-guard in LLVM Clang. With this flag the compiler inserts a call to __sanitizer_cov_trace_pc_guard on every edge of the control-flow graph, signaling when that edge has been executed. This feedback allows the fuzzer to prioritize inputs that lead to new code paths.
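To make the instrumentation idea concrete, here is a minimal std-only sketch of what such a callback conceptually feeds. The names CoverageMap, on_edge and is_new_coverage are hypothetical and exist only for this illustration; they are not part of Clang or LibAFL.

```rust
/// Hypothetical stand-in for the shared map that the instrumentation writes to.
pub struct CoverageMap {
    hits: Vec<u8>,
}

impl CoverageMap {
    pub fn new(num_edges: usize) -> Self {
        Self { hits: vec![0; num_edges] }
    }

    /// What the inserted callback conceptually does: bump the counter of one edge.
    /// Out-of-range ids are ignored in this sketch.
    pub fn on_edge(&mut self, guard_id: usize) {
        if let Some(slot) = self.hits.get_mut(guard_id) {
            *slot = slot.saturating_add(1);
        }
    }

    /// Clear the map before the next execution.
    pub fn reset(&mut self) {
        self.hits.iter_mut().for_each(|h| *h = 0);
    }

    /// Mark every edge hit in this run as seen and report whether any was new.
    pub fn is_new_coverage(&self, seen: &mut [bool]) -> bool {
        let mut new = false;
        for (hit, seen_slot) in self.hits.iter().zip(seen.iter_mut()) {
            if *hit > 0 && !*seen_slot {
                *seen_slot = true;
                new = true;
            }
        }
        new
    }
}
```

A fuzzer would call reset before each run, let the target call on_edge as it executes, and keep the input whenever is_new_coverage returns true.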

Step one: Building the harness

The following steps all take place in lib.rs.
Here we build the harness. Because we are statically linking, we first import the function we want to fuzz from the program. The harness takes a BytesInput, a standard input type in LibAFL; writing custom input types isn’t too difficult as long as you understand LibAFL internals.
We separate the input into two chunks, the magic (the first 3 bytes) and the payload (the rest), and pass both to the imported vuln_func. Inputs of 4 bytes or fewer are skipped.

extern "C" {
    fn vuln_func(magic: *const u8, payload: *const u8, payload_len: u32) -> bool;
}

fn fuzz() {
    let mut harness = |input: &BytesInput| {
        let total_len = input.bytes().len();
        if total_len <= 4 {
            return ExitKind::Ok;
        }

        let magic = &input.bytes()[..3];
        let payload = &input.bytes()[3..];

        unsafe {
            vuln_func(magic.as_ptr(), payload.as_ptr(), payload.len() as u32);
            ExitKind::Ok
        }
    };
...
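
The splitting logic of the harness can be pulled out and tested on its own, without LibAFL or the C target. The helper below is a sketch; split_input is a hypothetical name that does not appear in the original fuzzer.

```rust
/// Split raw fuzzer bytes into (magic, payload) the same way the harness does.
/// Returns None for inputs the harness would skip (4 bytes or fewer).
pub fn split_input(bytes: &[u8]) -> Option<(&[u8], &[u8])> {
    if bytes.len() <= 4 {
        return None;
    }
    // The first 3 bytes are the magic, everything after is the payload.
    Some(bytes.split_at(3))
}
```

For example, split_input(b"ABCDE") yields the magic b"ABC" and the payload b"DE", while a 4-byte input yields None.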

Step two: Monitor and EventManager

The Monitor keeps track of the statistics reported by every client (fuzzer instance) and offers methods to display them. Here SimpleMonitor simply prints each status line.
The EventManager handles the events exchanged with other instances of the fuzzer. These events commonly carry new test cases, statistics and other status information.

...
    let monitor = SimpleMonitor::new(|s| println!("{s}"));
    // Keep the returned state: step four reuses it via `state.unwrap_or_else`.
    let (state, mut restarting_mgr) =
        match setup_restarting_mgr_std(monitor, 0x539, EventConfig::AlwaysUnique) {
            Ok(res) => res,
            Err(err) => match err {
                Error::ShuttingDown => {
                    return
                }
                _ => {
                    panic!("[-] Failed to setup the restarter: {err}")
                }
            },
        };
...

Step three: Observer and Feedback

Observers and feedbacks work together to detect “interesting” behavior in the target program. Observers gather information from the target during each run, and feedbacks use that information to decide whether an input is truly interesting.
We have two important variables, feedback and objective, which both get passed to the state. Think of the feedback as the data that guides the fuzzer (here edge coverage and execution time) and of the objective as the goals we want the fuzzer to find (here crashes and timeouts).

    let edges_observer = unsafe {
        HitcountsMapObserver::new(
            StdMapObserver::from_mut_ptr(
                "edges",
                EDGES_MAP.as_mut_ptr(),
                MAX_EDGES_FOUND)).track_indices()
    };

    let time_observer = TimeObserver::new("Time");
    let map_feedback = MaxMapFeedback::new(&edges_observer);
    let time_feedback = TimeFeedback::new(&time_observer);

    let mut feedback = feedback_or!(
        map_feedback,
        time_feedback
    );

    let mut objective = feedback_or_fast!(CrashFeedback::new(), TimeoutFeedback::new());
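
As a rough illustration of how a hit-count observer and a max-map feedback cooperate, the std-only sketch below buckets raw hit counts AFL-style and reports an input as interesting when any map entry exceeds its historical maximum. bucket and MaxMapSketch are hypothetical names, and this is not LibAFL's actual implementation.

```rust
/// AFL-style bucketing of a raw hit count into a coarse class, so that
/// small changes in loop counts do not look like new coverage.
pub fn bucket(count: u8) -> u8 {
    match count {
        0 => 0,
        1 => 1,
        2 => 2,
        3 => 4,
        4..=7 => 8,
        8..=15 => 16,
        16..=31 => 32,
        32..=127 => 64,
        128..=255 => 128,
    }
}

/// Sketch of a max-map feedback: remembers the highest bucket seen per entry.
pub struct MaxMapSketch {
    history: Vec<u8>,
}

impl MaxMapSketch {
    pub fn new(len: usize) -> Self {
        Self { history: vec![0; len] }
    }

    /// Returns true if any entry exceeds its historical maximum,
    /// and updates the history in that case.
    pub fn is_interesting(&mut self, observed: &[u8]) -> bool {
        let mut interesting = false;
        for (past, &raw) in self.history.iter_mut().zip(observed) {
            let now = bucket(raw);
            if now > *past {
                *past = now;
                interesting = true;
            }
        }
        interesting
    }
}
```

With this sketch, hitting an edge 4 times and then 5 times is not interesting the second time, because both counts fall into the same bucket.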

Step four: State

The state is kind of like the heart of the fuzzer: it holds everything, including the RNG, the corpora, and the data belonging to the feedback and the objective.
The nice thing about the state is that it can be serialized and saved. We won’t be doing that in this blog post but… it’s possible :p

    let mut state = state.unwrap_or_else(|| {
        StdState::new(
            StdRand::new(),
            InMemoryCorpus::new(),
            OnDiskCorpus::new("./crashes").unwrap(),
            &mut feedback,
            &mut objective,
        )
        .unwrap()
    });

Step five: Mutator and Scheduler

The mutator is responsible for modifying inputs. In this case we’re using havoc, a set of mutations that has been standard within the AFL suite for a very long time and has historically performed well.
The scheduler defines how the fuzzer requests the next test case from the corpus. It has hooks for corpus add/replace/remove so that complex scheduling algorithms can collect data. Here we use RandScheduler, which picks a test case at random.

    let mutator = StdScheduledMutator::new(havoc_mutations());

    let scheduler = RandScheduler::new();
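
To give a feel for what these two pieces do, here is a std-only sketch of a stacked, havoc-like mutation and of random scheduling. XorShift, havoc_sketch and rand_schedule are hypothetical names; the real havoc_mutations() contains many more mutation operators than the four shown here.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crates.
pub struct XorShift(u64);

impl XorShift {
    pub fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state.
        Self(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Value in 0..bound. Callers must pass a non-zero bound.
    pub fn below(&mut self, bound: usize) -> usize {
        (self.next_u64() % bound as u64) as usize
    }
}

/// Apply a random stack of 1 to 4 simple mutations, loosely in the spirit of havoc.
pub fn havoc_sketch(input: &mut Vec<u8>, rng: &mut XorShift) {
    let rounds = 1 + rng.below(4);
    for _ in 0..rounds {
        let byte = rng.next_u64() as u8;
        if input.is_empty() {
            input.push(byte);
            continue;
        }
        let idx = rng.below(input.len());
        match rng.below(4) {
            0 => input[idx] ^= 1u8 << rng.below(8), // flip one bit
            1 => input[idx] = byte,                 // overwrite a byte
            2 => input.insert(idx, byte),           // insert a byte
            _ => {
                input.remove(idx);                  // delete a byte
            }
        }
    }
}

/// Random scheduling: every corpus entry is equally likely to be picked.
pub fn rand_schedule(corpus_len: usize, rng: &mut XorShift) -> Option<usize> {
    if corpus_len == 0 {
        None
    } else {
        Some(rng.below(corpus_len))
    }
}
```

Stacking several small mutations per round is what lets a single mutation step make larger jumps than one bit flip at a time.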

Step six: Fuzzer and Executor

The Fuzzer is, well… the fuzzer, which is basically a queue for sending inputs up the chain to be run by the executor.
The Executor is responsible for running the target with our input. Since we’re statically linking everything, we can do everything in-process, which significantly boosts our fuzzing speed. When possible, it’s generally better to fuzz in-process, as we do here.

    let mut fuzzer = StdFuzzer::new(scheduler, feedback, objective);

    let mut executor = InProcessExecutor::with_timeout(
        &mut harness,
        tuple_list!(edges_observer, time_observer),
        &mut fuzzer,
        &mut state,
        &mut restarting_mgr,
        Duration::new(10, 0),
    ).unwrap();
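
The idea of in-process execution can be sketched with the standard library alone: the harness is called as a plain function in the fuzzer's own process, and a failure is turned into an exit kind. This sketch uses catch_unwind as a stand-in, which only catches Rust panics; a real in-process executor also has to deal with native crashes and timeouts. ExitKindSketch and run_in_process are hypothetical names.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Simplified result of one execution (hypothetical, for this sketch only).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum ExitKindSketch {
    Ok,
    Crash,
}

/// Run the harness in the current process and turn a Rust panic into a crash result.
pub fn run_in_process<F>(harness: &mut F, input: &[u8]) -> ExitKindSketch
where
    F: FnMut(&[u8]),
{
    // AssertUnwindSafe: we accept that the harness state may be inconsistent
    // after a panic, since a crashing input ends the run anyway.
    match catch_unwind(AssertUnwindSafe(|| (*harness)(input))) {
        Ok(()) => ExitKindSketch::Ok,
        Err(_) => ExitKindSketch::Crash,
    }
}
```

Because no new process is spawned per input, the cost of one execution is close to the cost of a function call, which is where the speed advantage comes from.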

Step seven: Generator and Stage

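The Generator is responsible for generating the initial inputs. Here we generate 8 random inputs of at most 1024 bytes each when the state has no inputs loaded yet.
The Stage defines an individual step in the fuzzer’s execution pipeline. Here a single mutational stage applies our mutator to the scheduled test case.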

    let mut generator = RandBytesGenerator::new(NonZero::new(1024).unwrap());
    if state.must_load_initial_inputs() {
        state.generate_initial_inputs_forced(&mut fuzzer, &mut executor, &mut generator, &mut restarting_mgr, 8).unwrap();
    }

    let stages = StdMutationalStage::new(mutator);

Step eight: Running the fuzzer

To start the fuzzer we could use fuzz_loop, which fuzzes indefinitely. Since this is only an example, we use fuzz_loop_for with one million iterations instead.
We mark fuzzer_main with #[no_mangle] so that Rust does not mangle its symbol name and the linker can resolve it by name when the fuzzer and the target program are statically linked together.

...
    let iters = 1_000_000;
    fuzzer.fuzz_loop_for(
        &mut tuple_list!(stages),
        &mut executor,
        &mut state,
        &mut restarting_mgr,
        iters,
    ).unwrap();
}

#[no_mangle]
pub extern "C" fn fuzzer_main() {
    SimpleStdoutLogger::set_logger().unwrap();
    log::set_max_level(log::LevelFilter::Trace);
    fuzz();
}
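
To tie the steps together, here is a toy, std-only sketch of what a bounded fuzz loop conceptually does: pick a corpus entry, mutate it, run the target, and keep the input if coverage grew. toy_target and toy_fuzz_loop are hypothetical names, and the "coverage" is just the number of leading bytes matching b"FUZ", not real edge coverage.

```rust
/// Toy target: reports how many leading bytes of the input match b"FUZ".
fn toy_target(input: &[u8]) -> usize {
    input.iter().zip(b"FUZ").take_while(|(a, b)| a == b).count()
}

/// Run a bounded loop and return the final corpus and the best coverage level.
pub fn toy_fuzz_loop(iters: u64, seed: u64) -> (Vec<Vec<u8>>, usize) {
    let mut rng = seed | 1; // xorshift state must be non-zero
    let mut next = move || {
        rng ^= rng << 13;
        rng ^= rng >> 7;
        rng ^= rng << 17;
        rng
    };

    let mut corpus = vec![vec![0u8; 3]]; // one initial 3-byte seed input
    let mut best = 0; // highest coverage level seen so far

    for _ in 0..iters {
        // Scheduler: pick a random corpus entry.
        let mut candidate = corpus[(next() % corpus.len() as u64) as usize].clone();
        // Mutator: overwrite one random byte.
        let idx = (next() % candidate.len() as u64) as usize;
        candidate[idx] = next() as u8;
        // Executor + feedback: run the target and keep inputs that go deeper.
        let cov = toy_target(&candidate);
        if cov > best {
            best = cov;
            corpus.push(candidate);
        }
    }
    (corpus, best)
}
```

Each kept input becomes a stepping stone for the next byte of the magic value, which is why coverage feedback beats purely random guessing of all three bytes at once.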