LibAFL Fix for #1882, MapFeedback gives incorrect count

I think this should do what is needed to fix #1882... Haven't fully tested it though (just ran the small example in forkserver_libafl_cc with some extra logs to check it lines up)

Feb 26 '24 13:02 DanBlackwell

Please use the hashbrown HashSet to be no_std compatible. Also, I wonder how this might impact performance. It's only run for new/interesting testcases so it may not be too bad, but maybe there's an even more performant way to build this (open question)?

Feb 26 '24 15:02 domenukk

Switched to hashbrown now, had originally just grep'd for HashSet and used the the first implementation I saw in the codebase sorry.

I think you could improve the speed a lot by keeping a HashSet of all the discovered edges in MapFeedbackMetadata; then you can just insert all of the novelties into that set each time a new entry is discovered, then get the len and you should have it all populated. Of course you're paying slightly in terms of memory and code complexity. depends how hot this runs; do you have any profiling data?

Feb 26 '24 16:02 DanBlackwell

I think this might be a solution for the wrong issue. The real issue here is that we update the global metadata in is_interesting (which has gotten us into trouble in the past, when conditional map feedback would cause the discarding of new inputs but keep their edge data); shouldn't we simply add the manager as a parameter in the append_metadata here, so that we can update the global score at the time of the actual update? This would also reduce the relatively expensive cost of the recording of new edges, more than we already are.

Feb 27 '24 12:02 addisoncrump

Just to be clear, the value this gives is still wrong if the map feedback's metadata is discarded.

Feb 27 '24 12:02 addisoncrump

@addisoncrump I agree that moving the update into append_metadata is a better solution; and we are already iterating the history_map once there IIRC - so we can save one full iteration of whole map. I've never heard about conditional map feedback though.

Ultimately you and the other maintainers are best placed to make a call on an API change (to add manager into append_metadata parameters).

Feb 27 '24 13:02 DanBlackwell

I've never heard about conditional map feedback though.

Suppose you want to only deduplicate crashes by whether they explore new code regions:

let objective = feedback_and!(CrashFeedback::new(), MapFeedback::new(&edges_observer));

Here, MapFeedback gets called with is_interesting, but never append_metadata, so as a result the edges are not committed back to the history map, but they are reported as increased by this implementation. You could of course use feedback_and_fast here but I've been in a few situations where I needed to check the map before an even more expensive operation.

Feb 27 '24 15:02 addisoncrump

Maybe it's a philosophical question, but if hitting a particular edge always causes a crash, should we not count that edge as covered?

Feb 29 '24 12:02 DanBlackwell

I would argue no -- it shouldn't be in the history map if no input in the corpus would trigger that edge.

Feb 29 '24 13:02 addisoncrump

shouldn't we simply add the manager as a parameter in the append_metadata here, so that we can update the global score at the time of the actual update?

i think it is good

Mar 05 '24 15:03 tokatoka

I made the changes in #1936

Can you take a look if this fixes your problem? @DanBlackwell

Mar 13 '24 20:03 tokatoka