LibAFL Proposed projects

In this issue, I propose several projects based on LibAFL (like libafl_frida) that we would be glad to include here. As LibAFL is newly born, there is a lot of work to do on the library itself and we mostly work only on that, so we are seeking help for these projects.

[ ] Start rewriting AFL++'s afl-fuzz in Rust as a frontend of LibAFL. We aim to be compatible with the current C implementation. The core logic is already in LibAFL, but the rewrite is a non-trivial software engineering task.
- Outcome: the implementation, even if not with feature parity with AFL++
- Skills: Rust, C, system programming, software engineering
- Difficulty: medium
- Possible mentors: @vanhauser-thc @andreafioraldi @domenukk
- Length: 350 hours
[ ] Extend Forkserver (#111) to work on Windows, including https://github.com/sslab-gatech/winnie/tree/master/forklib from Winnie in libafl_targets
- Outcome: the implementation and a set of working examples combined with SanCov and libafl_frida
- Skills: Rust, C, system programming, win32 development
- Difficulty: medium
- Possible mentors: @domenukk @andreafioraldi @tokatoka
- Length: 175 hours
[ ] Frida ASan and CmpLog for Windows and other architectures (arm, arm64, x86, x86_64). Most of the code can be ported from C (https://github.com/AFLplusplus/AFLplusplus/tree/stable/frida_mode) to Rust.
- Outcome: the implementation and a set of working examples
- Skills: Rust, C, assembly, system programming, win32 development
- Difficulty: medium
- Possible mentors: @tokatoka @domenukk
- Length: 175 hours
[ ] Injectable libafl_frida into running targets + JavaScript API support for libafl_frida and libafl_sugar. The work may involve patches and contributions to Frida's Rust bindings https://github.com/frida/frida-rust
- Outcome: the implementation and a set of working examples
- Skills: Rust, C, assembly, system programming, win32 development
- Difficulty: medium
- Possible mentors: @andreafioraldi @tokatoka
- Length: 175 hours
[ ] Implement syscall emulation for filesystem and network in libafl_qemu. The student must implement something similar to preeny to hook the network API, and an emulated filesystem that can be snapshotted and restored, in both cases by hooking the syscalls in libafl_qemu user mode
- Outcome: the implementation and a set of working examples
- Skills: Rust, C, system programming
- Difficulty: medium
- Possible mentors: @andreafioraldi @domenukk @rmalmain
- Length: 175 hours
[ ] Implement the Pangolin mutator (https://wcventure.github.io/FuzzingPaper/Paper/SP20_PANGOLIN.pdf) on top of the existing concolic execution API
- Outcome: the implementation and a set of working examples
- Skills: Rust, C++, experience in symbolic execution
- Difficulty: medium
- Possible mentors: @domenukk @vanhauser-thc @addisoncrump
- Length: 175 hours
[ ] LibAFL Workers / RemoteWorkerLauncherStage + RemoteWorkerCollectorStage. The details are in #293
- Outcome: the implementation and a set of working examples
- Skills: Rust
- Difficulty: medium
- Possible mentors: @domenukk @tokatoka @addisoncrump
- Length: 175 hours
[ ] Implement the AFLGo directed fuzzer https://github.com/aflgo/aflgo
- Outcome: the implementation and a set of working examples
- Skills: Rust, C, C++, LLVM
- Difficulty: medium
- Possible mentors: @andreafioraldi @domenukk @tokatoka @addisoncrump
- Length: 175 hours
[ ] Create a libafl_qemu-based clone of afl-qemu-trace to be used in AFL++
- Outcome: a feature-equivalent clone of afl-qemu-trace with all the supported env vars
- Skills: Rust, C
- Difficulty: medium
- Possible mentors: @andreafioraldi @rmalmain @vanhauser-thc
- Length: 175 hours
[ ] Adapt kAFL / Nyx to LibAFL QEMU. For now, LibAFL QEMU supports emulation in both user mode and system mode. We would like to fully integrate hypervisor-based fuzzing into LibAFL QEMU, with an up-to-date kernel module and integration with the current implementation (snapshotting, etc.).
- Outcome: the implementation, even if not with feature parity with AFL++
- Skills: Rust, C, system programming, software engineering, linux kernel
- Difficulty: medium
- Possible mentors: @rmalmain
- Length: 350 hours
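
To give an idea of the filesystem half of the libafl_qemu syscall emulation project listed above, here is a minimal sketch of an in-memory filesystem that can be snapshotted and restored between executions. It only uses the Rust standard library; `MemFs` and all of its methods are hypothetical names made up for illustration and are not part of libafl_qemu. A real implementation would be driven by the syscall hooks and would need file descriptors, offsets, directories, and so on.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory filesystem (illustration only, not a libafl_qemu type).
/// Maps a path to the full contents of the file.
#[derive(Clone, Default)]
pub struct MemFs {
    files: HashMap<String, Vec<u8>>,
}

impl MemFs {
    /// Stand-in for a hooked `write`: appends `data` to `path`,
    /// creating the file if it does not exist yet.
    pub fn write(&mut self, path: &str, data: &[u8]) {
        self.files
            .entry(path.to_string())
            .or_default()
            .extend_from_slice(data);
    }

    /// Stand-in for a hooked `read`: returns the file contents, if the file exists.
    pub fn read(&self, path: &str) -> Option<&[u8]> {
        self.files.get(path).map(Vec::as_slice)
    }

    /// Captures the whole filesystem state (a deep copy, for simplicity).
    pub fn snapshot(&self) -> MemFs {
        self.clone()
    }

    /// Rolls the filesystem back to a previously captured state.
    pub fn restore(&mut self, snap: &MemFs) {
        *self = snap.clone();
    }
}
```

The idea is to call `snapshot()` once the target has reached the point where fuzzing starts, and `restore()` after every execution, so that files written by one input never leak into the next run.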

Also, if you want to implement any of the recent fuzzing techniques (https://wcventure.github.io/FuzzingPaper/ can be useful), feel free to ping us to find out whether we are already implementing the technique you are interested in.

May 22 '21 10:05 andreafioraldi

see #551 for full system libafl_qemu

Feb 22 '22 17:02 bitwave

I'm interested in the project Bridge LibAFL And Nyx, can someone provide more materials? It seems that libnyx has not been actively maintained 😂

Mar 31 '22 08:03 syheliel

@syheliel it is actively maintained. it was integrated into afl++ just 1-2 months ago by sergej, one of the two authors. check out the nyx stuff in afl++. otherwise sergej is very helpful if you have questions.

Mar 31 '22 08:03 vanhauser-thc

@vanhauser-thc thanks!

Mar 31 '22 09:03 syheliel

Highly interested in the project related to PANGOLIN mutator. I am curious to know what other recent academic research projects are being integrated into AFL++ at this moment.

Can you please tell me what you use for communication (irc or slack)?

Apr 02 '22 21:04 imranur-rahman

hello, we are on this channel https://github.com/AFLplusplus/AFLplusplus/issues/681

Apr 03 '22 01:04 tokatoka

hello, I am highly interested in contributing to the project. Please guide me through some basic starting steps.

Sep 06 '22 13:09 Kanaintmeandyo

Hi, I'd just like to let you know that we are currently working on a Unicorn-based approach that will also make it possible to emulate multiple processes or threads in parallel. It is based on a custom kernel implementation that delegates all I/O to system components to which the fuzzer can individually supply inputs (i.e. the fuzzer acts as a stand-in for the actual component). This also makes multi-input-stream fuzzing possible (e.g. fuzzing a TCP stream in conjunction with UDP inputs, which is relevant to RTSP, for example). The whole thing also implements copy-on-write, enabling the fuzzer to jump back to specific points to make stateful fuzzing easier.
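
To make the copy-on-write part a bit more concrete, here is a rough sketch of the general idea using only the Rust standard library. It is not the actual implementation; `CowMemory`, `PAGE_SIZE` and the method names are all made up for illustration. Pages are reference-counted, a checkpoint only copies the page table, and a page is duplicated the first time it is written after a checkpoint.

```rust
use std::rc::Rc;

/// Illustrative page size; a real emulator would use the guest's page size.
const PAGE_SIZE: usize = 4096;

/// Hypothetical paged memory whose pages are shared between checkpoints
/// until one side writes to them.
#[derive(Clone)]
pub struct CowMemory {
    pages: Vec<Rc<[u8; PAGE_SIZE]>>,
}

impl CowMemory {
    /// Creates `num_pages` zero-filled pages.
    pub fn new(num_pages: usize) -> Self {
        Self {
            pages: (0..num_pages).map(|_| Rc::new([0u8; PAGE_SIZE])).collect(),
        }
    }

    /// Reads one byte at `addr`.
    pub fn read(&self, addr: usize) -> u8 {
        self.pages[addr / PAGE_SIZE][addr % PAGE_SIZE]
    }

    /// Writes one byte at `addr`. `Rc::make_mut` copies the page only if a
    /// checkpoint still holds a reference to it.
    pub fn write(&mut self, addr: usize, value: u8) {
        let page = Rc::make_mut(&mut self.pages[addr / PAGE_SIZE]);
        page[addr % PAGE_SIZE] = value;
    }

    /// Cheap checkpoint: clones the page table (reference counts), not the page data.
    pub fn checkpoint(&self) -> CowMemory {
        self.clone()
    }

    /// Returns true if both memories still point at the same physical page.
    pub fn shares_page_with(&self, other: &CowMemory, page: usize) -> bool {
        Rc::ptr_eq(&self.pages[page], &other.pages[page])
    }
}
```

Jumping back to a specific point then amounts to replacing the live memory with a clone of the stored checkpoint, which again only touches reference counts.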

Since this is a huge project however, it will take some more time as currently only a student of mine and I are working on it. Once we have a somewhat working version ready with a few of the basic pieces, we will most likely open source it and also be willing to integrate at least parts of it into libafl.

I would be interested in hearing if there is any overlap with already existing developments and parts in libAFL, since it is difficult keeping an overview over all the new stuff being published.

Nov 02 '23 12:11 mlgiraud

@mlgiraud you've probably already seen #1617 and #913 -- these are probably closest to what you're doing. It may also be possible to avoid Unicorn/custom kernel entirely by developing a libafl kernel module (just use no_std and it should just work, though you will need an allocator) and intercepting these calls either by intercepting the system calls themselves, or by wrapping the creation of targeted sources of input by way of opening character devices defined by the kernel module (by e.g. intercepting with LD_PRELOAD or source code modification). Hope this helps!

Nov 07 '23 19:11 addisoncrump

@addisoncrump Yeah i saw your multipart input PR. I think it is not 100% suited for my use case but i will most likely adapt a few ideas from there. This will become clearer in the future when everything becomes more stable on our side. I'm not quite sure how #913 relates to my work though. Could you elaborate on why you think this might be relevant?

Regarding Unicorn: we are using full CPU emulation, since we want to be able to emulate different architectures (e.g. ARM on x86). Of course, an approach that skips emulation will be faster but not as flexible, and raw speed is not our goal here. The nice benefit we get here is also that we can decide which process to schedule, making it possible to e.g. further investigate concurrency bug detection via fuzzing (at least that's what I'm hoping for ;))

Nov 08 '23 08:11 mlgiraud

#913 is relevant because it allows for the fuzz testing of programs which process sequences of inputs and buffer. For example, fuzzing a remote network target where each "input" (probably a higher-level packet of some sort) has a response. Doesn't make a ton of sense to send, wait, receive when you can repeatedly send and then asynchronously receive feedback. Since you're dead-set on using Unicorn, this probably doesn't make sense for your use case as I originally believed.
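
As a toy illustration of "repeatedly send, then asynchronously receive", here is a sketch that uses only standard-library channels and a thread as a stand-in for the remote target. This is not the API from #913; `spawn_target` and `run_pipelined` are hypothetical names, and the fake target simply answers each packet with its sequence number and length.

```rust
use std::sync::mpsc;
use std::thread;

/// Stand-in for a remote target (illustration only): answers each packet
/// with `(sequence number, packet length)`. A real target would sit behind a socket.
fn spawn_target() -> (mpsc::Sender<Vec<u8>>, mpsc::Receiver<(usize, usize)>) {
    let (input_tx, input_rx) = mpsc::channel::<Vec<u8>>();
    let (resp_tx, resp_rx) = mpsc::channel();
    thread::spawn(move || {
        for (seq, packet) in input_rx.iter().enumerate() {
            // Stop early if nobody is listening for responses anymore.
            if resp_tx.send((seq, packet.len())).is_err() {
                break;
            }
        }
    });
    (input_tx, resp_rx)
}

/// Sends every input without waiting for a reply, then collects the
/// responses afterwards instead of doing send, wait, receive per input.
pub fn run_pipelined(inputs: Vec<Vec<u8>>) -> Vec<(usize, usize)> {
    let (tx, rx) = spawn_target();
    for input in inputs {
        tx.send(input).expect("target hung up");
    }
    // Closing the input channel lets the target thread finish,
    // which in turn ends the response iterator below.
    drop(tx);
    rx.iter().collect()
}
```

In a fuzzer, the collected responses would be turned into feedback for the inputs that produced them, matched by sequence number.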

Nov 09 '23 12:11 addisoncrump

Hi, I would like to take up the ideas for GSOC, what is the procedure for the same? Any contributing guidelines? How to contact the mentor and get your proposal reviewed?

Mar 03 '24 11:03 kd1729

Ideally candidates will work on github issues before the proposal deadline to show us their engineering skills, and talk to us about which projects they are interested in. Then we decide on candidates according to how confident we are they will be able to finish projects successfully. Happy hacking :)

Mar 04 '24 22:03 domenukk

Any new ideas for 2024?

Mar 19 '24 09:03 maxammann