oak icon indicating copy to clipboard operation
oak copied to clipboard

Change LookupData entries from a hash map to a B-tree map

Open mariaschett opened this issue 4 years ago • 7 comments

Changing LookupData entries from a hash map to a B-tree map as a preceding step to #2202 to ensure no regressions.

mariaschett avatar Oct 11 '21 10:10 mariaschett

BTW we currently do have some benchmarks, but they do not exercise the lookup API particularly well. I wonder whether we should have a benchmark with a lot more (e.g. 1M) ​k/v entries (programmatically generated, possibly via https://github.com/project-oak/oak/tree/main/oak_functions/lookup_data_generator used as a library) ​to ensure we are not introducing regressions with this change.

tiziano88 avatar Oct 11 '21 10:10 tiziano88

There is example code for using the lookup data generator as part of a benchmark at https://github.com/project-oak/oak/blob/main/oak_functions/examples/weather_lookup/module/src/tests.rs#L197-L200

conradgrobler avatar Oct 11 '21 10:10 conradgrobler

Ah good point, I forgot about that! Interesting to note that the benchmark is expected to take the same amount of time (20ms) for both 10k and 200k entries, so presumably what slows it down is really the data generation? Or does each iteration actually take substantially longer in the 200k case, and 20ms is just an upper bound?

tiziano88 avatar Oct 11 '21 10:10 tiziano88

Most of the time for that example is used to do the S2 geometry-related processing, so it is not very good at benchmarking just the lookup performance. When it previously did a linear scan through all the lookup data the two dfferent sizes had significantly different cutoffs.

conradgrobler avatar Oct 11 '21 10:10 conradgrobler

Yeah I wonder whether we should move that to a benchmark of the weather_lookup example, and have a simpler benchmark for just straight k/v lookup (that may stay in the test file for the loader, since it's pretty basic functionality, but also alternatively may be moved to the k/v lookup example if the author prefers).

tiziano88 avatar Oct 11 '21 12:10 tiziano88

Is this still relevant?

mariaschett avatar Sep 30 '22 10:09 mariaschett

I think that having this as an option for lookup data would still be useful, but I don't think this is a high priority at the moment.

conradgrobler avatar Sep 30 '22 11:09 conradgrobler

I don't think this is relevant at the moment. I will close this for now and we can create a new issue if we have a specific need for this in future.

conradgrobler avatar Nov 25 '22 13:11 conradgrobler