Integer key and value
It seems mph essentially implements a map[[]byte][]byte. Is it possible to change the code so that it becomes map[uint32][]uint8? Go's built-in map has just too much overhead for my application, and I was wondering if this could help.
Do you mean slice overhead? It's not ideal, it's true. It might be possible to use a single implementation to support both. I'll have a ponder.
I meant the map overhead. But right, there is the slice overhead as well, although it is not important in my case.
Along these lines, it makes sense to change the definition of CHDBuilder to
type CHDBuilder struct {
	keys   []KeyType
	values []ValueType
}
to save space. In fact, doing away with Entry completely might be the best direction to go.
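To make the parallel-slice layout concrete, here is a minimal sketch. KeyType and ValueType are placeholder aliases I picked for illustration (uint32 and []byte), and the Add and Len methods are hypothetical helpers for this sketch, not a claim about what mph exposes today.

```go
package main

import "fmt"

// Placeholder types for illustration only; the real definitions would
// be whatever the package settles on.
type KeyType = uint32
type ValueType = []byte

// CHDBuilder keeps keys and values in two parallel slices, so entry i
// is the pair (keys[i], values[i]) and no per-entry struct is needed.
type CHDBuilder struct {
	keys   []KeyType
	values []ValueType
}

// Add appends one key/value pair (hypothetical helper for this sketch).
func (b *CHDBuilder) Add(key KeyType, value ValueType) {
	b.keys = append(b.keys, key)
	b.values = append(b.values, value)
}

// Len reports how many pairs have been added.
func (b *CHDBuilder) Len() int { return len(b.keys) }

func main() {
	var b CHDBuilder
	b.Add(1, []byte("one"))
	b.Add(2, []byte("two"))
	fmt.Println(b.Len()) // prints 2
}
```

With a fixed-size key type such as uint32, the keys slice holds the integers directly, with no slice header per key.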
As for hashing, one can use unsafe (base pointer) and reflect (for the object size) to do the hashing regardless of the definition of ValueType (something that would utilize the underlying hash function with signature hash(ptr unsafe.Pointer, size int) uint64).
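A rough sketch of that idea, under some assumptions: the names hashMem and hashKey are made up for illustration, the byte loop is FNV-1a, it uses unsafe.Sizeof rather than reflect to get the size, and it needs Go 1.18+ (generics and unsafe.Slice). It only makes sense for fixed-size types without pointers or padding, since it hashes the raw in-memory bytes.

```go
package main

import (
	"fmt"
	"unsafe"
)

// hashMem hashes size raw bytes starting at ptr using FNV-1a (64-bit).
// Hypothetical helper: the caller must guarantee ptr points to at
// least size readable bytes.
func hashMem(ptr unsafe.Pointer, size int) uint64 {
	const (
		offset64 = 14695981039346656037
		prime64  = 1099511628211
	)
	h := uint64(offset64)
	for _, b := range unsafe.Slice((*byte)(ptr), size) {
		h ^= uint64(b)
		h *= prime64
	}
	return h
}

// hashKey hashes the in-memory representation of *k. The result
// depends on the platform's byte order, so it is not portable across
// architectures.
func hashKey[T any](k *T) uint64 {
	return hashMem(unsafe.Pointer(k), int(unsafe.Sizeof(*k)))
}

func main() {
	a, b := uint32(42), uint32(42)
	fmt.Println(hashKey(&a) == hashKey(&b)) // prints true
}
```

Because the hash depends on byte order, a table built this way on one architecture may not be readable on another; encoding the key explicitly (for example with encoding/binary) avoids that at the cost of a few more instructions.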
Also, I noticed that each call to chdHash creates a new fnv64 hasher (which allocates). Is this really the correct behavior?
In fact, I checked the fnv source and it's so simple that writing your own inline implementation fnv64([]byte) uint64 with no allocations should be trivial, without any need for the New64a, Write, Sum64 ceremony.
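For reference, an inline version could look roughly like the sketch below. It assumes the FNV-1a variant (what New64a computes); the function name fnv64a is my own choice, and main only checks it against the standard library.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// FNV-1a 64-bit parameters, as used by hash/fnv.
const (
	offset64 = 14695981039346656037
	prime64  = 1099511628211
)

// fnv64a computes the 64-bit FNV-1a hash of data without allocating a
// hash.Hash64: XOR in each byte, then multiply by the prime.
func fnv64a(data []byte) uint64 {
	h := uint64(offset64)
	for _, b := range data {
		h ^= uint64(b)
		h *= prime64
	}
	return h
}

func main() {
	key := []byte("hello")

	// Reference result from the standard library.
	ref := fnv.New64a()
	ref.Write(key)

	fmt.Println(fnv64a(key) == ref.Sum64()) // prints true
}
```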
I'm definitely amenable to performance improvements. Changing the definition of CHDBuilder is a good idea.
First step might be some benchmarks.
Okay, I've implemented some of these changes and we're down from:
BenchmarkCHD 10000000 180 ns/op
to:
BenchmarkCHD 20000000 132 ns/op
I wanted this for ints too. I didn't see a clean way to integrate it into this package without completely cluttering the source code, so I made a fork: https://github.com/Jille/uint64mph