rustworkx icon indicating copy to clipboard operation
rustworkx copied to clipboard

Swap `ahash` with `foldhash`

Open IvanIsCoding opened this issue 1 year ago • 5 comments

This matches hashbrown's 0.15 release that switches the default hasher. This is more to keep consistency with the internals.

I will put this on hold until #1293 is merged, we are still stuck with an older version of hashbrown until we update PyO3 anyway

IvanIsCoding avatar Oct 13 '24 13:10 IvanIsCoding

This fails MSRV, I guess hashbrown will be stuck as well

IvanIsCoding avatar Oct 13 '24 16:10 IvanIsCoding

I used to use hashbrown for my own code but stopped when I noticed this on their github repo:

Since Rust 1.36, this is now the HashMap implementation for the Rust standard library. However you may still want to use this crate instead since it works in environments without std, such as embedded systems and kernels.

Maybe there have been additional features in hashbrown that make it worth using now but I'm curious is the difference is measurable.

jamestwebber avatar Oct 14 '24 17:10 jamestwebber

I used to use hashbrown for my own code but stopped when I noticed this on their github repo:

Since Rust 1.36, this is now the HashMap implementation for the Rust standard library. However you may still want to use this crate instead since it works in environments without std, such as embedded systems and kernels.

Maybe there have been additional features in hashbrown that make it worth using now but I'm curious is the difference is measurable.

That is a fair point. It is not strictly necessary, but I feel hashbrown is always trying to push a little bit above the standard library in terms of API and performance.

Also, it is a transitive dependency of indexmap so the crate is already in the dependencies anyway. We might as well benefit from some performance improvements like the one from foldhash

IvanIsCoding avatar Oct 14 '24 17:10 IvanIsCoding

Maybe there have been additional features in hashbrown that make it worth using now but I'm curious is the difference is measurable.

The key reasons we use hashbrown on rustworkx is the default hasher is higher performance, it actually is surprising how much a difference using ahash made here when I did the benchmarking for it many years ago (I hadn't heard of foldhash before this PR though so I can't comment on that). We could have gotten this by directly using ahash on the stdlib hashmap though (or ahash's mapping type). The other key reason though was rayon support, we don't use it extensively but if we need a parallel iterator over a hashmap or hashset hashbrown has a rayon feature we can use.

mtreinish avatar Oct 14 '24 17:10 mtreinish

it actually is surprising how much a difference using ahash made here when I did the benchmarking for it many years ago

Yeah I was curious if the difference is still there, now that the stdlib uses the same implementation. But the transitive dependency due to indexmap makes it a moot point!

jamestwebber avatar Oct 14 '24 17:10 jamestwebber

I am closing this as indexmap is now pinned to a version with hashbrown that still uses ahash

IvanIsCoding avatar Jan 14 '25 18:01 IvanIsCoding