35 Added the support for FILTERKEYS

Open MayukhSobo opened this issue 3 years ago • 1 comments

Fixes #35

It seems I have added too much but there is a reason why something so trivial should take this much time. Apologies for such an extreme level of changes and please allow me to explain it.

FILTERKEYS is supposed to work on multiple keys and keeping in mind that we are not using any special data structure to store the keys in any ORDERED manner and the fact the regex can be complete random, the best time complexity that we can achieve is O(n) via iterating over all the keys.
This leaves me with no choice but either implement the FILTERKEYS trivially or use sheer brute force of concurrency to optimise the performance.
I choose the second option. This also goes with the thought that @arpitbbhayani had to make individual commands as concurrent but not the multiple command just to keep it simple for now.
With the introduction of go routine worker pool, I had to introduce synchronisation issues inevitably. To fight this there are two ways. Either I use normal golang maps with Mutex or use sync.Map from go1.9.
Go has a weird property. The map is not thread safe. Even if multiple goroutines are working on individual keys. It is not even thread safe if we are reading on keys A and B from one goroutine and writing on C and D from another goroutines. That's weird. Maps are only safe iff we initialise the map before all the go routine starts and they don't change before reading ends.
Here this is not the case because we are also deleting the expired keys while preforming read keys for example. This means the read in this scenario is also not safe.
Implementing mutex along with map is good but cumbersome. People usually use sync.Mutex but there is also sync.RWMutex. Nevertheless, performance of map with mutex doesn't scale well in my experience and has a lot of implementation blockers.
sync.Map was introduced keeping in mind this issue in go 1.9. It scales EXCEPTIONALLY better if the vertical scaling of the system is considerable.
sync.Map has a problem. We can't use len function on it. So I had to create my own custom Storage(DiceKVStore) with an atomic count in it.
Because I had to introduce sync.Map and goroutine pool (can explain why not simple go routine if needed), I was mistakenly introducing a lot of cyclic dependency in project. In simpler word, package A is importing package B and package B is importing package A. This reminded me that the current design is very tightly coupled. So I had to introduce a lot of packages.

I know I have introduced a lot of changes and I also know huge changes in a single PR is a bad practise but I couldn't stay away from optimising it. It took me a lot of time to rewire the whole thing. I would request the reviewers to review the code and give pointers. If you guys decide to merge this, I shall be ready to work along with other contributors to adopt change that happened in between. If not, you can put it in a separate experimental branch or else worst option, I can continue it on my fork.

Thanks Mayukh

Nov 02 '22 21:11 MayukhSobo

@MayukhSobo Thanks for the contribution. The code base has moved forward lot. Please rebase and re-submit if you are still keen to fix this. We will be cleaning up old PR's if they are inactive.

Jul 14 '24 10:07 yashs360