beviah
Hmm, it seems these authors have used VecMap to align subwords: https://github.com/GeorgeVern/smala
I don't know C, so it would be hard for me to implement it. I need approximate counters for a truly large number of events, billions, which makes 32-bit hashes unusable.
But it seems that even hundreds of thousands of events are an issue for a 32-bit hash: https://preshing.com/images/small-probabilities.png There is not much cost to a collision as long as the counter does not get...
Maybe I wasn't clear enough: I don't mean billions of events, but billions of unique events :) Hundreds of millions at least.
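To put rough numbers on this (a back-of-the-envelope sketch of the birthday bound, not a calculation from that chart): with a b-bit hash and N distinct keys, the expected number of colliding pairs is approximately N(N-1)/2^(b+1).

```python
# Birthday-bound estimate: expected number of colliding key pairs when
# hashing N distinct keys uniformly into a b-bit space is roughly
# N * (N - 1) / 2^(b + 1).

def expected_collisions(n_keys: int, hash_bits: int) -> float:
    """Expected number of colliding key pairs (birthday approximation)."""
    return n_keys * (n_keys - 1) / 2 ** (hash_bits + 1)

# 500 million unique events with a 32-bit hash: tens of millions of
# colliding pairs are expected, so 32 bits is indeed hopeless here.
print(f"{expected_collisions(500_000_000, 32):.3g}")  # on the order of 1e7

# The same load with a 64-bit hash: expectation well below one collision.
print(f"{expected_collisions(500_000_000, 64):.3g}")  # under 0.01
```

This is why the jump from 32-bit to 64-bit hashes matters at hundreds of millions of unique keys: the expected collision count scales with N²/2^b.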
As for the motivation, I wanted to compare performance with this library: https://github.com/mikrosk/py-probstructs
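For context, here is a minimal pure-Python count-min sketch, the kind of approximate counter such probabilistic-structure libraries provide. This is my own illustrative sketch, not code from py-probstructs, and the hashing scheme (salted BLAKE2b) is an arbitrary choice for the example:

```python
import hashlib

class CountMinSketch:
    """Minimal count-min sketch: depth rows of width counters.
    Estimates never under-count; they over-count with small probability."""

    def __init__(self, width: int = 2048, depth: int = 4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item: str, row: int) -> int:
        # One independent-ish hash per row, derived by salting the digest.
        digest = hashlib.blake2b(
            item.encode(), digest_size=8, salt=bytes([row])
        ).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, item: str, count: int = 1) -> None:
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item: str) -> int:
        # Minimum across rows bounds the over-count from collisions.
        return min(self.table[row][self._index(item, row)]
                   for row in range(self.depth))

cms = CountMinSketch()
cms.add("event-a", 5)
cms.add("event-b")
print(cms.estimate("event-a"))  # at least 5
```

The appeal for billions of unique events is that memory is fixed at width × depth counters regardless of how many distinct keys flow through, at the cost of a bounded over-estimate.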
I may have interpreted that table in the wrong way... it seems in real time I will be getting collisions every few seconds, which is not terrible... and that this rate does...
rocksdict 0.3.24
Python 3.12.3
Ubuntu 24.04.1 LTS
I managed to get manual compaction working

```python
from rocksdict import Options, DBCompactionStyle

def speedb_options():
    opt = Options()
    opt.create_if_missing(True)
    opt.create_missing_column_families(True)
    opt.set_max_open_files(-1)  # you don't have this set
    opt.set_max_background_jobs(4)
    opt.set_max_compaction_bytes(512 * 1024 * 1024)
    opt.set_max_subcompactions(4)
    opt.set_compaction_style(DBCompactionStyle.universal())
    ...
```
@ashvardanian I think this bug may be related to #514, which occurs inconsistently, and regardless of the index size! Some indexes just take hours to load. Probably some mapping issue...