PalDB

Why is the benchmark incomplete?

Open dlangdon opened this issue 10 years ago • 2 comments

I found it curious that you compare performance against 2 alternatives but memory usage only against Java HashSets. It would be nice to compare both performance and memory against the same 3 alternatives; otherwise it seems you are conveniently selecting data instead of showing the right tradeoff.

dlangdon, Nov 10 '15 18:11

I had a similar desire to see more details in the benchmark, and I have a couple of questions. If PalDB is a key-value store, why compare memory usage with a HashSet? Wouldn't a HashMap be a more appropriate comparison? Reading the code (TestMemoryUsageHashMap.java), it actually uses a HashSet, in spite of the class name. Maybe change the name or the Collection type?

Also, I found it pretty surprising that you could put 100M integers in a HashSet and only use 500MB of memory. That's 5 bytes per Integer, which is considerably less than the space required for an Integer, even without the HashSet overhead. (Again, reading the code provided the answer: the test is for 10M Integers, not 100M.)
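For reference, the arithmetic behind that observation can be sketched as below. This is only an illustration: the class and method names are hypothetical, the 500MB figure is taken as decimal megabytes, and the note that a boxed Integer typically takes around 16 bytes on a 64-bit HotSpot JVM is an assumption about a common configuration, not something measured here.

```java
public class BytesPerElement {
    /** Average bytes per element, treating totalMegabytes as decimal MB (an assumption). */
    static double bytesPerElement(long totalMegabytes, long elementCount) {
        return (totalMegabytes * 1_000_000.0) / elementCount;
    }

    public static void main(String[] args) {
        // Reading the benchmark as 100M Integers in 500MB: implausibly small,
        // since a boxed Integer alone is typically larger than this.
        System.out.println(bytesPerElement(500, 100_000_000L)); // 5.0

        // Using the 10M count found in the test code: a far more believable
        // figure for a HashSet entry plus its boxed Integer.
        System.out.println(bytesPerElement(500, 10_000_000L));  // 50.0
    }
}
```

At 10M elements the figure works out to 50 bytes per entry rather than 5, which is why the element count matters so much when reading the memory numbers.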

Also, it's hard to judge the read performance without knowing what value was stored with each key. Obviously bigger values would tend to be slower. Apparently the value is a boolean; maybe add that to the README.md?

lwhite1, Dec 10 '15 14:12

Well, give me the same 512 MB of memory and I can take an infinite number of integers and build you a set with them ;-)

dlangdon, Dec 10 '15 15:12
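One possible reading of the 512 MB remark is a plain bitmap: one bit per possible 32-bit value comes to 2^32 / 8 bytes, which is 512 MiB, so any stream of ints, however long, fits in that fixed space once duplicates collapse. The sketch below is an illustration of that idea under this assumption, not anything taken from PalDB or its benchmark; `IntBitmapSet` and its methods are hypothetical names.

```java
public class IntBitmapSet {
    private final long[] words;   // 64 membership bits per word
    private final long capacity;  // covers unsigned values in [0, capacity)

    IntBitmapSet(long capacity) {
        this.capacity = capacity;
        this.words = new long[(int) ((capacity + 63) >>> 6)];
    }

    /** Bytes of backing storage needed to cover the given number of distinct values. */
    static long bytesFor(long capacity) {
        return ((capacity + 63) >>> 6) * 8;
    }

    /** Adds a value, interpreting the int as unsigned so negatives map to the upper half. */
    void add(int value) {
        long idx = index(value);
        words[(int) (idx >>> 6)] |= 1L << (idx & 63);
    }

    boolean contains(int value) {
        long idx = index(value);
        return (words[(int) (idx >>> 6)] & (1L << (idx & 63))) != 0;
    }

    private long index(int value) {
        long idx = value & 0xFFFFFFFFL;
        if (idx >= capacity) {
            throw new IllegalArgumentException("value outside covered range: " + idx);
        }
        return idx;
    }

    public static void main(String[] args) {
        // Covering every 32-bit value needs exactly 512 MiB of bits.
        System.out.println(bytesFor(1L << 32)); // 536870912

        // A small instance, to avoid allocating the full bitmap in a demo.
        IntBitmapSet set = new IntBitmapSet(1024);
        set.add(7);
        set.add(7); // duplicates cost nothing extra
        System.out.println(set.contains(7)); // true
        System.out.println(set.contains(8)); // false
    }
}
```

The point of the comparison is that memory use for a set of integers depends heavily on the representation, so a HashSet figure on its own says little about what a fair baseline would be.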