
Update _tfidf.py to allow for sentence-level TFIDF ngram-matching

Open DGaffney opened this issue 3 years ago • 2 comments

I need a sentence-level n-gram option because I'm comparing short texts for similarity. Maybe this option is useful for others!

DGaffney avatar Sep 10 '22 23:09 DGaffney

Apologies for the late reply! I have to look into this a bit further: this could also be resolved by simply keeping the whitespace, or it might even make sense to create a different back-end that is optimized for sentence-level matching.
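To make the two options concrete, here is a minimal, standard-library-only sketch of the difference between character n-grams (with and without whitespace) and word-level n-grams for short texts. It is not PolyFuzz's code: `char_ngrams`, `word_ngrams`, `tfidf_vectors` and `cosine` are hypothetical helper names, and the smoothed IDF formula is an assumption chosen for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n=3, keep_whitespace=True):
    """Character n-grams; optionally strip all whitespace first (hypothetical helper)."""
    if not keep_whitespace:
        text = "".join(text.split())
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text, max_n=2):
    """Word-level n-grams of sizes 1..max_n, a rough stand-in for sentence-level matching."""
    tokens = text.lower().split()
    grams = []
    for size in range(1, max_n + 1):
        grams += [" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]
    return grams


def tfidf_vectors(docs, tokenizer):
    """Sparse TF-IDF vectors as dicts, using an assumed smoothed IDF: ln((1+N)/(1+df)) + 1."""
    counts = [Counter(tokenizer(doc)) for doc in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for c in counts for term in c)
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


def cosine(a, b):
    """Cosine similarity between two sparse dict vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


docs = ["the cat", "thec at"]

# Without whitespace, both strings collapse to "thecat" and look identical.
stripped = tfidf_vectors(docs, lambda d: char_ngrams(d, keep_whitespace=False))
# With whitespace kept, word boundaries show up in the n-grams.
kept = tfidf_vectors(docs, lambda d: char_ngrams(d, keep_whitespace=True))

sim_stripped = cosine(stripped[0], stripped[1])
sim_kept = cosine(kept[0], kept[1])
```

In this toy case the whitespace-stripped variant cannot tell the two strings apart, while keeping whitespace (or switching the tokenizer to `word_ngrams`) preserves word boundaries. Whether that is enough, or whether a dedicated back-end is the better route, would need testing on real short-text data.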

MaartenGr avatar Oct 18 '22 05:10 MaartenGr

Sorry for my late reply. This is definitely not optimized, yes. I just wanted to start the conversation helpfully rather than just demanding you build it :)

DGaffney avatar Nov 02 '22 00:11 DGaffney