How to handle 'ValueError: empty vocabulary; perhaps the documents only contain stop words' in group_similar_strings

Open gw00207 opened this issue 4 years ago • 2 comments

Currently I have to wrap every call to group_similar_strings in a try/except clause in case all of the strings contain only stop words. Is it possible to handle this case differently, e.g. by returning all strings ungrouped? Or perhaps by raising a more descriptive error, so that I can catch and handle OnlyStopwordsError or similar instead of any ValueError? Great package, many thanks.
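For illustration, the workaround described above could look roughly like the sketch below. It is not code from string_grouper: `group_or_passthrough` and `group_fn` are hypothetical names, and `OnlyStopwordsError` is only the exception name proposed in this issue. The grouping function is passed in as a parameter (in practice it would be `group_similar_strings`), so the sketch runs without the package installed. It assumes the error message contains the text "empty vocabulary", as quoted in the issue title.

```python
class OnlyStopwordsError(ValueError):
    """Hypothetical, narrower error; subclasses ValueError so old handlers still work."""


def group_or_passthrough(strings, group_fn, fallback=True):
    """Call group_fn(strings) and handle only the empty-vocabulary failure.

    strings  -- the input collection of strings
    group_fn -- the grouping callable (assumed: group_similar_strings)
    fallback -- if True, return the input unchanged (all strings ungrouped);
                if False, raise OnlyStopwordsError instead
    """
    try:
        return group_fn(strings)
    except ValueError as err:
        # Any other ValueError is unrelated to stop words, so let it propagate.
        if "empty vocabulary" not in str(err):
            raise
        if fallback:
            return strings
        raise OnlyStopwordsError(str(err)) from err
```

With `fallback=False` the caller can write `except OnlyStopwordsError:` rather than catching every `ValueError`, which is the second option asked about here.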

gw00207 avatar Sep 01 '21 09:09 gw00207

That makes sense, @gw00207, and it is a simple enough addition to make. Can you create a pull request for this?

ParticularMiner avatar Sep 01 '21 10:09 ParticularMiner

Please see https://github.com/Bergvca/string_grouper/pull/67

gw00207 avatar Sep 01 '21 13:09 gw00207