rpds icon indicating copy to clipboard operation
rpds copied to clipboard

Implement __hash__

Open Julian opened this issue 2 years ago • 3 comments

  • [x] Queue
  • [ ] HashTrieMap
  • [ ] HashTrieSet
  • [ ] List

Julian avatar Jan 25 '24 14:01 Julian

I'm not familiar enough with Rust to provide a PR that implements __hash__ for List but I would really appreciate such an addition. :)

qerub avatar Apr 24 '24 08:04 qerub

Hey.

I think List should be relatively straightforward and basically match what I did for Queue (as opposed to the other 2 which require a different approach) -- but I haven't made time to do it, and especially so now that there's some work to figure out how to upgrade to PyO3 0.21...

But I can try to put it on the TODO list.

Can I ask out of curiosity what you're using rpds.py for that led you to need it?

Julian avatar Apr 24 '24 08:04 Julian

Sure! I want an immutable list implementation in an application I'm developing and since rpds-py is already a (currently transitive) dependency via jsonschema it's appealing to use instead of adding yet another dependency like pyrsistent.

qerub avatar Apr 24 '24 09:04 qerub

I can open a PR implementing __hash__ for list!

I'm not sure how to implement hash for the HashTrieMap and HashTrieSet since sequence matters when building a hash from a hasher. The simplest on top of my mind was sort the keys (items) in the map (set) based on their hashes. For example, for map, it could be

hashes = []
FOR key, value IN map:
    hash_tuple = (python_hash(key), python_hash(value))
    
SORT hashes

FOR k_hash, v_hash IN hashes:
    hasher.write_i128( (k_hash << 64) | v_hash )

RETURN hasher.finish()

I'm not a cryptography expert so I'm not sure if this is a good idea. :(

FlickerSoul avatar Jul 07 '24 23:07 FlickerSoul

I just wrote something for __hash__ here. I'm curious what you think :)

Thanks!

FlickerSoul avatar Jul 08 '24 00:07 FlickerSoul

Hey! Help here would also be great.

For the unordered collections I think the best idea so far is we port the CPython hashing code which is used for frozenset -- it can be found here: https://github.com/python/cpython/blob/d69529d31ccd1510843cfac1ab53bb8cb027541f/Objects/setobject.c#L715

Julian avatar Jul 08 '24 13:07 Julian

Thanks for the direction! I'll take a look :)

FlickerSoul avatar Jul 08 '24 13:07 FlickerSoul

Thanks to @FlickerSoul this should be done. Please give it a try!

Julian avatar Aug 06 '24 16:08 Julian