Fix for #1882, MapFeedback gives incorrect count
I think this should do what is needed to fix #1882... Haven't fully tested it though (just ran the small example in forkserver_libafl_cc with some extra logs to check it lines up)
Please use the hashbrown HashSet to be no_std compatible.
Also, I wonder how this might impact performance. It's only run for new/interesting testcases so it may not be too bad, but maybe there's an even more performant way to build this (open question)?
Switched to hashbrown now, had originally just grep'd for HashSet and used the the first implementation I saw in the codebase sorry.
I think you could improve the speed a lot by keeping a HashSet of all the discovered edges in MapFeedbackMetadata; then you can just insert all of the novelties into that set each time a new entry is discovered, then get the len and you should have it all populated. Of course you're paying slightly in terms of memory and code complexity. depends how hot this runs; do you have any profiling data?
I think this might be a solution for the wrong issue. The real issue here is that we update the global metadata in is_interesting (which has gotten us into trouble in the past, when conditional map feedback would cause the discarding of new inputs but keep their edge data); shouldn't we simply add the manager as a parameter in the append_metadata here, so that we can update the global score at the time of the actual update? This would also reduce the relatively expensive cost of the recording of new edges, more than we already are.
Just to be clear, the value this gives is still wrong if the map feedback's metadata is discarded.
@addisoncrump I agree that moving the update into append_metadata is a better solution; and we are already iterating the history_map once there IIRC - so we can save one full iteration of whole map. I've never heard about conditional map feedback though.
Ultimately you and the other maintainers are best placed to make a call on an API change (to add manager into append_metadata parameters).
I've never heard about conditional map feedback though.
Suppose you want to only deduplicate crashes by whether they explore new code regions:
let objective = feedback_and!(CrashFeedback::new(), MapFeedback::new(&edges_observer));
Here, MapFeedback gets called with is_interesting, but never append_metadata, so as a result the edges are not committed back to the history map, but they are reported as increased by this implementation. You could of course use feedback_and_fast here but I've been in a few situations where I needed to check the map before an even more expensive operation.
Maybe it's a philosophical question, but if hitting a particular edge always causes a crash, should we not count that edge as covered?
I would argue no -- it shouldn't be in the history map if no input in the corpus would trigger that edge.
shouldn't we simply add the manager as a parameter in the append_metadata here, so that we can update the global score at the time of the actual update?
i think it is good
I made the changes in #1936
Can you take a look if this fixes your problem? @DanBlackwell