Max Bachmann

Results 81 issues of Max Bachmann

This essentially reopens issue #39: the introduced fix does not solve the problem, it only makes this explicit example work. For example, `FuzzySearch.partialRatio("no", "bnonco");` should...
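A brute-force reference is a handy way to check a partial ratio implementation against cases like the one above. The following is a minimal Python sketch, not the library's code: it assumes the partial ratio is the best normalized Indel similarity between the shorter string and any substring of the longer string that is at most as long as the shorter one. The names `indel_ratio` and `partial_ratio_reference` are hypothetical.

```python
def indel_ratio(a: str, b: str) -> float:
    """Normalized Indel similarity in [0, 100], computed from the LCS length."""
    if not a and not b:
        return 100.0
    # Classic O(len(a) * len(b)) LCS dynamic program, one row at a time.
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return 100.0 * 2 * prev[-1] / (len(a) + len(b))


def partial_ratio_reference(s1: str, s2: str) -> float:
    """Best ratio of the shorter string against substrings of the longer one.

    Assumption: windows range from length 1 up to len(shorter).
    """
    needle, haystack = (s1, s2) if len(s1) <= len(s2) else (s2, s1)
    if not needle:
        return 100.0 if not haystack else 0.0
    best = 0.0
    for start in range(len(haystack)):
        longest_end = min(len(haystack), start + len(needle))
        for end in range(start + 1, longest_end + 1):
            best = max(best, indel_ratio(needle, haystack[start:end]))
    return best
```

Under this definition, `"no"` occurs verbatim inside `"bnonco"`, so the reference returns the maximum score of 100 for that pair. The sketch is far too slow for production use, but it is simple enough to serve as an oracle in randomized tests.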

Add a pre-release test that checks whether the submodules `extern/rapidfuzz-cpp` and `extern/jarowinkler-cpp` are using the newest available tag.

enhancement
good first issue

The pure Python implementation still misses the following parts:
- [ ] Levenshtein.editops
- [ ] Levenshtein.opcodes
- [ ] Indel.editops
- [ ] Indel.opcodes
- [ ] LCSseq.editops
- ...

enhancement
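For the missing pure Python `Levenshtein.editops`, one straightforward approach is to fill the full distance matrix and walk it backwards. The sketch below assumes uniform costs of 1 and returns plain `(tag, src_pos, dest_pos)` tuples; the function name and the return type are illustrative assumptions, not the library's actual API.

```python
def levenshtein_editops(s1: str, s2: str) -> list[tuple[str, int, int]]:
    """Return one optimal edit script turning s1 into s2 (uniform costs)."""
    rows, cols = len(s1) + 1, len(s2) + 1
    # dist[i][j] is the Levenshtein distance between s1[:i] and s2[:j].
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete s1[i - 1]
                dist[i][j - 1] + 1,         # insert s2[j - 1]
                dist[i - 1][j - 1] + cost,  # match or replace
            )

    # Backtrace from the bottom-right corner, preferring the diagonal.
    ops = []
    i, j = len(s1), len(s2)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and s1[i - 1] == s2[j - 1] and dist[i][j] == dist[i - 1][j - 1]:
            i, j = i - 1, j - 1  # characters match, nothing to record
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            i, j = i - 1, j - 1
            ops.append(("replace", i, j))
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            i -= 1
            ops.append(("delete", i, j))
        else:
            j -= 1
            ops.append(("insert", i, j))
    ops.reverse()
    return ops
```

This uses O(N * M) memory, which is acceptable for a fallback implementation but would need a more compact trace for long inputs.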

Currently the process module has the following functions:

| function | kind | explanation |
|------------|--------|-----------------|
| extractOne | one x many | returns the best match as (choice, score,...

enhancement

A banded version of the Levenshtein distance algorithm should be implemented as described in https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.142.1245&rep=rep1&type=pdf. This would reduce the runtime of `string_metric.levenshtein` from `O(N/64 * M)` to `O(score_cutoff/64 * M`...

performance
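The idea behind banding can be shown without bit-parallelism. The following sketch is a plain dynamic-programming variant, not the bit-parallel algorithm from the linked paper: it only evaluates cells within `score_cutoff` of the main diagonal, because any alignment that leaves this band already costs more than `score_cutoff`. The function name and the convention of returning `score_cutoff + 1` on overflow are assumptions made for this example.

```python
def levenshtein_banded(s1: str, s2: str, score_cutoff: int) -> int:
    """Levenshtein distance, or score_cutoff + 1 if it exceeds score_cutoff.

    Assumes score_cutoff >= 0.
    """
    if abs(len(s1) - len(s2)) > score_cutoff:
        return score_cutoff + 1
    inf = score_cutoff + 1  # sentinel for "outside the band or too expensive"

    # Row 0: distance from the empty prefix of s1 to s2[:j].
    prev = [j if j <= score_cutoff else inf for j in range(len(s2) + 1)]
    for i in range(1, len(s1) + 1):
        lo = max(1, i - score_cutoff)
        hi = min(len(s2), i + score_cutoff)
        # For clarity the row is allocated at full width; a tuned version
        # would store only the band itself.
        cur = [inf] * (len(s2) + 1)
        if i <= score_cutoff:
            cur[0] = i
        for j in range(lo, hi + 1):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost, inf)
        prev = cur
    return prev[len(s2)]
```

Only about `2 * score_cutoff + 1` cells per row are evaluated, which is the same saving the bit-parallel version would obtain at word granularity.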

Some of the processor functions can run for a long time. However, it is currently not possible to quit by pressing Ctrl+C. Instead, it is required to manually kill...

enhancement

All the algorithms in the process module should be fairly simple to run in parallel.

performance
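As an illustration of why the process functions parallelize easily, the sketch below splits the choices into chunks, finds the best match per chunk, and then reduces the partial results. It is a pure Python stand-in under stated assumptions: the scorer uses `difflib.SequenceMatcher` rather than a RapidFuzz scorer, and `extract_one_parallel` is a hypothetical name. With pure Python scoring, threads will not yield a real speedup because of the GIL; the benefit would come from a native implementation that releases it.

```python
from concurrent.futures import ThreadPoolExecutor
from difflib import SequenceMatcher


def score(query: str, choice: str) -> float:
    """Stand-in scorer in [0, 100]; not a RapidFuzz scorer."""
    return 100.0 * SequenceMatcher(None, query, choice).ratio()


def best_in_chunk(query, chunk, offset):
    """Return (choice, score, global_index) for the best match in one chunk."""
    scored = [(score(query, c), -(offset + i), c) for i, c in enumerate(chunk)]
    best_score, neg_index, choice = max(scored)
    return choice, best_score, -neg_index


def extract_one_parallel(query, choices, workers=4):
    """Best match over all choices; ties resolve to the lowest index."""
    if not choices:
        return None
    chunk_size = -(-len(choices) // workers)  # ceiling division
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(best_in_chunk, query, choices[start:start + chunk_size], start)
            for start in range(0, len(choices), chunk_size)
        ]
        partial = [f.result() for f in futures]
    # Reduce step: highest score wins, then the smallest global index.
    return max(partial, key=lambda r: (r[1], -r[2]))
```

Each chunk is independent, so the only shared work is the final reduction over at most `workers` partial results.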

The [Smith Waterman algorithm](https://en.wikipedia.org/wiki/Smith%E2%80%93Waterman_algorithm) is a commonly used metric to compare strings. It would be useful to add it to RapidFuzz.

enhancement
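For reference, a minimal Smith-Waterman scorer fits in a few lines. This sketch assumes a linear gap penalty and returns only the best local alignment score, not the alignment itself; the function name and the default scoring parameters (`match=2`, `mismatch=-1`, `gap=-1`) are illustrative choices, not a proposed RapidFuzz API.

```python
def smith_waterman_score(s1: str, s2: str, match: int = 2,
                         mismatch: int = -1, gap: int = -1) -> int:
    """Best local alignment score between s1 and s2 (linear gap penalty)."""
    best = 0
    prev = [0] * (len(s2) + 1)
    for ch1 in s1:
        cur = [0]
        for j, ch2 in enumerate(s2, start=1):
            diag = prev[j - 1] + (match if ch1 == ch2 else mismatch)
            # Clamping at 0 lets a local alignment restart anywhere.
            cell = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(cell)
            best = max(best, cell)
        prev = cur
    return best
```

With the default parameters, `smith_waterman_score("abc", "xabcx")` evaluates to 6, since the three matching characters each contribute 2 and the surrounding `x` characters are ignored.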

Currently the editops/opcodes functions return only one possible optimal alignment. However, there can be more than one optimal alignment. It would make sense to add the possibility...

enhancement
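To see how several optimal alignments arise, one can follow every tied predecessor during the backtrace instead of just the first. The sketch below does this for the uniform-cost Levenshtein distance; the function name and the `(tag, src_pos, dest_pos)` tuple format are assumptions for this example. The number of optimal scripts can grow exponentially with input length, so this is only practical for short strings.

```python
def all_levenshtein_editops(s1: str, s2: str) -> list[list[tuple[str, int, int]]]:
    """Enumerate every optimal edit script turning s1 into s2."""
    rows, cols = len(s1) + 1, len(s2) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1, dist[i][j - 1] + 1,
                             dist[i - 1][j - 1] + cost)

    def backtrack(i, j):
        """Yield all optimal scripts for the prefixes s1[:i] and s2[:j]."""
        if i == 0 and j == 0:
            yield []
            return
        if i > 0 and j > 0:
            if s1[i - 1] == s2[j - 1]:
                if dist[i][j] == dist[i - 1][j - 1]:
                    yield from backtrack(i - 1, j - 1)
            elif dist[i][j] == dist[i - 1][j - 1] + 1:
                for ops in backtrack(i - 1, j - 1):
                    yield ops + [("replace", i - 1, j - 1)]
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            for ops in backtrack(i - 1, j):
                yield ops + [("delete", i - 1, j)]
        if j > 0 and dist[i][j] == dist[i][j - 1] + 1:
            for ops in backtrack(i, j - 1):
                yield ops + [("insert", i, j - 1)]

    return list(backtrack(len(s1), len(s2)))
```

For example, turning `"a"` into `"aa"` has two optimal scripts: the extra `a` can be inserted before or after the existing one.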