hashbrown icon indicating copy to clipboard operation
hashbrown copied to clipboard

Implement iterator which starts at arbitrary location

Open alfa07 opened this issue 2 years ago • 6 comments

This PR implements iterator which starts at arbitrary location in the table. There are several motivations for this change:

  • For security reasons one may wish to start iterations at arbitrary locations in the table. That what golang is doing
  • For implementing efficient pseudo-random sampling from the table (e.g for use in probabilistic algorithms)
  • For being able to port golang code with 100% semantic preservation

See this issue for a real-world motivating example.

That is just POC, if the idea in principle is acceptable I would be happy to shape it with some guidance from maintainers.

alfa07 avatar Apr 12 '23 19:04 alfa07

it does not work properly, that's my testing code

    fn print_map_random(m: &HashMap<&str, &str>,times: usize) {
        for _ in 0..times {
            for (k, v) in m.iter_at(0) {
                println!("{k}-{v}");
            }
            println!("=========")
        }
    }

    #[test]
    fn test_map() {
        let mut m = HashMap::new();
        m.insert("k1", "v1");
        m.insert("k2", "v2");
        m.insert("k3", "v3");
        m.insert("k4", "v4");
        m.insert("k5", "v5");
        m.insert("k6", "v6");
        print_map_random(&m, 10);
    }

phantomhker avatar Apr 18 '23 10:04 phantomhker

for (k, v) in m.iter_at(0) {

You need to randomize the hint to start at random offsets of the table.

the8472 avatar Apr 22 '23 16:04 the8472

This feels awkward for similar reasons as #382: you really want to be using IndexMap instead if you are doing any kind of operation that depends on the ordering/distribution of the keys.

Amanieu avatar Apr 24 '23 00:04 Amanieu

@Amanieu Using IndexMap is not the right choice for this use case. We want to access elements of the HashMap with O(1) per access in pseudo random order. Hashing of keys in the HashMap provides this order for free. IndexMap preserves insertion order which we don't want here as it is not pseudo random. Also IndexMap is plainly more expensive.

alfa07 avatar Apr 28 '23 16:04 alfa07

We have similar(same?) use case. We store view state (cache like) in Hashbrown map. Sometimes we want to pick any random value out of map and do some validation on its current state.

dzmitry-lahoda avatar Jun 25 '24 15:06 dzmitry-lahoda