dice icon indicating copy to clipboard operation
dice copied to clipboard

Add support for Bloom Filters

Open arpitbbhayani opened this issue 3 years ago • 8 comments

Is your feature request related to a problem? Please describe. Bloom filters are essential to check for existence, and having support for that is essential for real-world usecases.

Describe the solution you'd like Add commands BINIT, BADD, and BEXIST to initialize, add, and check existence within the bloom filter. BINIT can take in arguments around tolerance and initial allocation.

Some points to consider

  • the size of the bloom filter
  • seamless resize

https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=bloom+filter&btnG=

arpitbbhayani avatar Oct 18 '22 12:10 arpitbbhayani

interested to take a crack at this. and have a few queries here.

  • curious to understand what hash functions are we thinking to use since it needs to be uniformly distributed and independent, would murmur be a gtg choice for the current use case?
  • what does seamless resize infer, is it related to extending and decreasing the bloom filter size dynamically post BINIT?

rak3n avatar Oct 19 '22 05:10 rak3n

Hey @rak3n I am looking into this at the moment and have started working on this. There are still some ambiguities in the implementation and hence I am going through other open source Bloom implementations to converge.

arpitbbhayani avatar Oct 19 '22 17:10 arpitbbhayani

Hi, just stumbled upon this. I'd also interested in contributing to this issue if possible. Just wanted to share a reference implementation which go-ethereum uses. Majority of code lies here. Happy to help with any benchmarks if required :)

manav2401 avatar Oct 21 '22 05:10 manav2401

@manav2401 Thanks for sharing the reference, but we cannot just take this code (though open source). Also, there is a need to draft a solid specification around this.

Can you prepare a quick document on what you envision from a bloom filter and the feature? We will see what fits into the scheme and then work on the implementation.

arpitbbhayani avatar Oct 25 '22 06:10 arpitbbhayani

@arpitbbhayani sorry was away last week so couldn't respond.

Sure, let gather some more context on the same and prepare a document.

manav2401 avatar Nov 02 '22 12:11 manav2401

Hey @arpitbbhayani have jotted down few points in this document. Let me you what you think about it and if there's anything specific to be added/removed from it.

https://manavdarji.notion.site/Bloom-filters-in-DiceDB-6da85a9d2dba41b5afd94f34a74463f8

manav2401 avatar Nov 02 '22 15:11 manav2401

@manav2401 this is great. Are you available at 11pm tonight? We can hop on a call and close this.

Discord is where we meet. Just drop me a message.

If the time does not work, drop me a few suitable time and we can sync then.

arpitbbhayani avatar Nov 02 '22 16:11 arpitbbhayani

@arpitbbhayani 11 pm tonight (IST) works for me. Dropped a message on twitter.

manav2401 avatar Nov 02 '22 16:11 manav2401

This was completed with #99

JyotinderSingh avatar Jul 31 '24 13:07 JyotinderSingh