mph icon indicating copy to clipboard operation
mph copied to clipboard

Integer key and value

Open raichu opened this issue 11 years ago • 7 comments

It seems mph essentially implements a map[[]byte][]byte. Is it possible to change the code such that it becomes map[uint32][]uint8? Go's bulit-in map has just too much overhead for my application and I was wondering if this can help.

raichu avatar Mar 19 '14 02:03 raichu

Do you mean slice overhead? It's not ideal it's true. It might be possible to use a single implementation to support both. I'll have a ponder.

alecthomas avatar Mar 21 '14 02:03 alecthomas

I meant the map overhead. But right, there is the slice overhead as well, although it is not important in my case.

raichu avatar Apr 02 '14 04:04 raichu

Along these lines, it makes sense to change the definition of CHDBuilder to

type CHDBuilder struct {
    keys []KeyType
    values []ValueType
}

to save space. In fact, doing away with Entry completely might the best direction to go.

As for hashing, one can use unsafe (base pointer) and reflect (for the object size) to do the hashing regardless of the definition of ValueType (something that would utilize the underlying hash function with signature hash(ptr *unsafe.Pointer, size int) uint64).

raichu avatar Apr 08 '14 06:04 raichu

Also, I noticed that each call to chdHash creates a new fnv64 hasher (which allocates). Is this really the correct behavior?

In fact, I checked the fnv source and it's so simple that writing your inline implementation fnv64([]byte) uint64 with no allocations should be trivial, without any need for the New64a, Write, Sum64 ceremony.

raichu avatar Apr 08 '14 06:04 raichu

I'm definitely amenable to performance improvements. Changing the definition of CHDBuilder is a good idea.

First step might be some benchmarks.

alecthomas avatar Apr 08 '14 11:04 alecthomas

Okay, I've implemented some of these changes and we're down from:

BenchmarkCHD    10000000           180 ns/op

to:

BenchmarkCHD    20000000           132 ns/op

alecthomas avatar Apr 22 '14 23:04 alecthomas

I wanted this for ints too. I didn't see a clean way to integrate it into this package without completely cluttering the source code, so I made a fork: https://github.com/Jille/uint64mph

Jille avatar Sep 17 '23 09:09 Jille