Michael Axiak

15 comments by Michael Axiak

I see this issue as well, but I have no idea how to fix it :(

Try running with more memory.

I would expect option (2) to be faster, not slower. We should be within the same cache and there isn't any real I/O. Still, I can't argue with the numbers...

If you want to earn brownie points -- try writing a Cython version that calls out to the C test & add and see what the performance improvement is :)

This is done because hashing strings is separate from hashing unicode strings for performance. I'm not sure how to fix this problem since I don't know what encoding the strings...

After considering the technical challenges, I think I'm going to table this and add a note saying you should prefer ASCII strings and encode whatever you insert.
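For anyone wondering what "encode what you insert" could look like in practice, here is a rough sketch, assuming Python 3 and UTF-8. `to_bytes` and `digest` are hypothetical helpers and not part of any library API, and `hashlib.sha1` only stands in for whatever hash is actually used internally.

```python
import hashlib


def to_bytes(value, encoding="utf-8"):
    """Hypothetical helper: normalize text to bytes before it is hashed."""
    if isinstance(value, str):
        # Text is encoded explicitly, so the caller picks the encoding.
        return value.encode(encoding)
    # Bytes-like input is passed through unchanged.
    return bytes(value)


def digest(value):
    """Hypothetical stand-in for the hash step; always sees bytes."""
    return hashlib.sha1(to_bytes(value)).hexdigest()


# The same key hashes identically whether it arrives as text or as bytes.
same = digest("abc") == digest(b"abc")
```

The point is only that the encoding decision is made once, by the caller, so the text and bytes forms of a key cannot end up hashing differently.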

I can write up some more documentation if you care about it :)

I don't have a paper, but it's inspired by full-text search. In some circles you might see this called trigram or shingle indexing. @Glench wrote a wonderful intuitive description of...

Eh. I meant here: http://glench.github.io/fuzzyset.js/
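To give a feel for the trigram idea, here is a minimal sketch, not the library's actual code. The `trigrams` and `similarity` names, the `-` padding character, and the Jaccard overlap score are all my own assumptions for illustration; the real implementation may score matches differently.

```python
def trigrams(text):
    """Return the set of 3-character shingles of text, padded at both ends."""
    padded = "-" + text.lower() + "-"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a, b):
    """Jaccard overlap of the two trigram sets (an illustrative score only)."""
    grams_a, grams_b = trigrams(a), trigrams(b)
    if not grams_a and not grams_b:
        # Two empty strings are treated as identical.
        return 1.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# A transposition still shares the leading trigrams, so the score is
# somewhere between 0 and 1 instead of dropping to 0.
score = similarity("michael", "micheal")
```

Strings that share many trigrams score close to 1 and unrelated strings score 0, which is roughly why an index keyed on trigrams can find near matches without comparing against every stored string.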

We had an issue with \W, \D and \S that itsadok just fixed and I pushed out. However, I think there are still unicode issues as the groups in issue...