pthash icon indicating copy to clipboard operation
pthash copied to clipboard

Further CPU and GPU impementation alingment

Open stefanfred opened this issue 1 year ago • 1 comments

Aligned calculation of bucket count per partition to GPU Removed bucketer from space consumption (in my opinion we do not need to count anything towards space consumption that does not depend on the input keys themselves.)

GPU and CPU implementation now have same (max diff 0.004 bits per key) space consumption for same configuration

stefanfred avatar Apr 07 '24 14:04 stefanfred

Hi @stefanfred, the size on disk should match that returned by num_bits(), i.e., everything that we need to evaluate the function. So you cannot remove the space for the meta data, although small. I think you should add this space to the GPU implementation rather than removing it from the CPU implementation.

-Giulio

jermp avatar Apr 07 '24 19:04 jermp

Hey Giulio,

ok. I will add the bucketer space in the GPU implementation. However, we still need to use the same formula for bucket count per partition like on the GPU. I can not change that on the GPU side, because it does not allow a dependence between the bucket count per partition and the number of input keys. (would require shader recompilation for each MPHF instance)

stefanfred avatar Apr 08 '24 07:04 stefanfred

Hi @stefanfred, perfect, thanks!

A little nitpick: why the renaming from Minimal to NeedsFreeArray? A function by itself does not "need" anything, rather we choose to have it minimal or not. So I think the parameter should be named Minimal as it was before.

jermp avatar Apr 08 '24 10:04 jermp

My thought process was that minimal does not necessarily mean that the free array is used. E.g. for alpha=1 and additive hashing. I think calling it minimal would be a bit confusing if the PHF can be minimal without it.

stefanfred avatar Apr 08 '24 11:04 stefanfred

I have now added multithreaded encoding. Actually was a bottleneck now on the large machine :)

stefanfred avatar Apr 08 '24 11:04 stefanfred