
Optionally show stats during fuzzing session

Open lukacan opened this issue 1 year ago • 2 comments

Currently, it is not possible to determine (or somehow debug) whether the fuzzing session is actually fuzzing something and if the instructions are successfully executed. This means that based solely on the fuzzer's output and zero crashes, the user cannot tell if their program has no errors or if none of the targeted instructions were successfully executed.

This PR introduces a simple stats logging mechanism.

The best option would be to show stats for the fuzzing session right after the session ends. However, the honggfuzz workflow is as follows:

  1. In the main function, call fuzz_threadsStart.
  2. In the fuzz_threadsStart function, call fuzz_threadNew.
  3. Next, inside fuzz_threadNew, call fuzz_fuzzLoop, which, I guess, starts a new subprocess for the fuzz target, and we can see an infinite loop also in the fuzz_threadNew function.
  4. Lastly, we can see that the subprocess is forcefully closed if conditions are met, let's say the max number of iterations was hit.

Due to the SIGKILL, I think it is not possible to time the end of a fuzzing subprocess (fuzzing session), so we do not know exactly when to output the stats.

So, this PR aims to provide stats during the whole fuzzing session. A new function called run_fuzzer_with_stats inside the commander is created.

The stats accumulate the number of invocations and successful invocations of each instruction. Then, at the end of every fuzzing iteration, the stats are printed. If every instruction was successfully executed at least once, the whole structure is printed with a success message; if any instruction has 0 successful invocations, an error message is printed instead.

Because the underlying fuzzer tries to explore as many branches as possible, this can lead to two scenarios:

  1. A lot of success outputs at the end as it converges to exploring all possible branches.
  2. A lot of error outputs at the end as it cannot explore the code any further.
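The accumulate-and-report behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the names `InstructionStats`, `FuzzingStats`, `record` and `report` are hypothetical, and a `BTreeMap` is assumed only so that the printed output has a stable order.

```rust
use std::collections::BTreeMap;

/// Per-instruction counters (hypothetical type, for illustration only).
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct InstructionStats {
    pub invoked: u64,
    pub successful: u64,
}

/// Accumulated stats for the whole fuzzing session (hypothetical type).
#[derive(Debug, Default)]
pub struct FuzzingStats {
    instructions: BTreeMap<String, InstructionStats>,
}

impl FuzzingStats {
    /// Count one invocation of `instruction` and, if it succeeded,
    /// one successful invocation as well.
    pub fn record(&mut self, instruction: &str, success: bool) {
        let entry = self
            .instructions
            .entry(instruction.to_string())
            .or_default();
        entry.invoked += 1;
        if success {
            entry.successful += 1;
        }
    }

    /// Returns `Ok` with the whole structure if every recorded instruction
    /// has at least one successful invocation, otherwise `Err` naming the
    /// instructions that never succeeded.
    pub fn report(&self) -> Result<String, String> {
        let never_succeeded: Vec<&str> = self
            .instructions
            .iter()
            .filter(|(_, stats)| stats.successful == 0)
            .map(|(name, _)| name.as_str())
            .collect();

        if never_succeeded.is_empty() {
            Ok(format!("SUCCESS\n{:#?}", self.instructions))
        } else {
            Err(format!(
                "ERROR: no successful invocation of: {}",
                never_succeeded.join(", ")
            ))
        }
    }
}
```

In this sketch, `record` would be called once per executed instruction and `report` once at the end of each iteration, which is why the output shifts between error and success messages as the fuzzer explores more branches.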

Lastly, we have two options for indicating to the fuzzer that we want to see stats.

  • Use a cfg flag (similar to how fuzzing is utilized by honggfuzz).
  • Set our environment variable and evaluate it during execution.

I find the second option better because it has no impact on performance. For the first option, we could use:

Command::new("cargo")
    .env("HFUZZ_RUN_ARGS", fuzz_args)
    .env("CARGO_TARGET_DIR", cargo_target_dir)
    .env("HFUZZ_WORKSPACE", hfuzz_workspace)
    // tell fuzzer to output stats
    .env("RUSTFLAGS", "--cfg fuzzing_with_stats")
    .arg("hfuzz")
    .arg("run")
    .arg(target)
    .spawn()?;

However, this approach results in a full re-compilation even if trident fuzz run <FUZZ_TARGET> was already called.
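For comparison, the second option (an environment variable evaluated during execution) might look like the sketch below. The variable name `FUZZ_STATS` and both helper functions are assumptions made for illustration, not names taken from this PR. Because the value is read at run time and `RUSTFLAGS` stays unchanged, toggling it does not invalidate the already compiled fuzz target.

```rust
use std::env;

/// Hypothetical name of the variable that switches stats on.
pub const STATS_ENV_VAR: &str = "FUZZ_STATS";

/// Decide from the raw variable value whether stats are wanted.
/// Unset, empty and "0" all mean "off"; anything else means "on".
pub fn stats_enabled_from(value: Option<&str>) -> bool {
    matches!(value.map(str::trim), Some(v) if !v.is_empty() && v != "0")
}

/// Read the variable from the process environment.
pub fn stats_enabled() -> bool {
    stats_enabled_from(env::var(STATS_ENV_VAR).ok().as_deref())
}
```

On the spawning side this would amount to adding one more `.env(...)` call with the chosen variable name to the command, instead of setting `RUSTFLAGS`.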

NOTE regarding the text_generator.rs changes: cargo clippy reported the variable SUCCESS in lib.rs as unused, although it is used. So I just changed FINISH -> SUCCESS.

lukacan avatar Mar 13 '24 18:03 lukacan

Yes, so I updated the feature implementation. Essentially:

  • The stdout of the cargo hfuzz command is marked as piped and is parsed within Trident after the fuzzing session.
  • During the fuzzing session, invoked/successfully invoked instructions are harvested into a HashMap and then serialized into JSON format.
    • The accumulation process is no longer present, because the stats have to be output in every iteration so that Trident can sum them up afterwards.
  • This JSON-formatted output is then collected inside Trident at the end of the session and parsed back into a HashMap (line by line).
  • Lastly, the HashMap containing the statistics is displayed as a table.

Points to note:

  • The cargo hfuzz stdout is piped, not stderr. This is because redirecting stderr would also redirect the compilation output, which is undesirable.
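The collection step on the Trident side could be sketched as follows. This is an assumption-laden illustration rather than the actual implementation: `Counts`, `parse_line`, `sum_stats` and `run_and_collect` are hypothetical names, and to stay dependency-free the sketch replaces the JSON lines with a simple `<instruction> <invoked> <successful>` text format. A real implementation would deserialize each line with a JSON library instead. Lines that do not parse, such as regular fuzzer output, are skipped.

```rust
use std::collections::HashMap;
use std::io::{self, BufRead, BufReader};
use std::process::{Command, Stdio};

/// Summed counters for one instruction (hypothetical type).
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct Counts {
    pub invoked: u64,
    pub successful: u64,
}

/// Parse one stats line in the assumed stand-in format
/// `<instruction> <invoked> <successful>`; returns `None` for anything else.
fn parse_line(line: &str) -> Option<(String, Counts)> {
    let mut parts = line.split_whitespace();
    let name = parts.next()?;
    let invoked = parts.next()?.parse::<u64>().ok()?;
    let successful = parts.next()?.parse::<u64>().ok()?;
    if parts.next().is_some() {
        return None; // trailing tokens: not a stats line
    }
    Some((name.to_string(), Counts { invoked, successful }))
}

/// Read the piped output line by line and sum the per-iteration stats.
pub fn sum_stats<R: BufRead>(reader: R) -> HashMap<String, Counts> {
    let mut totals: HashMap<String, Counts> = HashMap::new();
    for line in reader.lines() {
        let line = match line {
            Ok(line) => line,
            Err(_) => break, // stop on a read error
        };
        if let Some((name, counts)) = parse_line(&line) {
            let total = totals.entry(name).or_default();
            total.invoked += counts.invoked;
            total.successful += counts.successful;
        }
    }
    totals
}

/// Spawn the fuzzer with only stdout piped (stderr stays attached to the
/// terminal, so compilation output remains visible) and collect the stats.
pub fn run_and_collect(target: &str) -> io::Result<HashMap<String, Counts>> {
    let mut child = Command::new("cargo")
        .arg("hfuzz")
        .arg("run")
        .arg(target)
        .stdout(Stdio::piped())
        .spawn()?;
    let stdout = child.stdout.take().expect("stdout was configured as piped");
    let totals = sum_stats(BufReader::new(stdout));
    child.wait()?;
    Ok(totals)
}
```

Keeping `sum_stats` generic over `BufRead` means the summation can be exercised on an in-memory buffer without starting a fuzzing session.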

lukacan avatar Mar 17 '24 22:03 lukacan

@Ikrk Ready for review

lukacan avatar Mar 22 '24 14:03 lukacan

I also added stats for failed invariant checks and created a new Jira task to potentially extend the stats with unhandled panics as well: https://ackeeblockchain.atlassian.net/browse/TRD-81

Ikrk avatar Jun 07 '24 10:06 Ikrk