New Zero Division Error Issue

Open • StephenCranney opened this issue 5 years ago • 2 comments

Several months ago I was using Fuzzymatcher often and never ran into a problem. However, now when I (and a colleague on a completely different OS) try to fuzzymatch two data frames, I get a ZeroDivisionError (traceback below). I've read through the responses to a similar problem from a few years ago (issue 42), and they suggest adding an "except (ValueError, ZeroDivisionError)" clause at line 44 of the tokencomparison.py file. The problem is that the file already catches ZeroDivisionError there, which suggests that this ZeroDivisionError is a new one with a different origin. Any suggestions for how to fix this without breaking the package would be helpful.

fuzzymatched=fuzzymatcher.fuzzy_left_join(dataframe1, dataframe2, 'address', 'address')
Traceback (most recent call last):

  File "", line 1, in
    fuzzymatched=fuzzymatcher.fuzzy_left_join(dataframe1, dataframe2, 'address', 'address')

  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/__init__.py", line 41, in fuzzy_left_join
    m.match_all()

  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/matcher.py", line 92, in match_all
    self.link_table = self._match_processed_data()

  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/matcher.py", line 136, in _match_processed_data
    this_record.find_and_score_potential_matches()

  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/record.py", line 76, in find_and_score_potential_matches
    self.matcher.data_getter.get_potential_match_ids_from_record(self)

  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/data_getter_sqlite.py", line 93, in get_potential_match_ids_from_record
    self._search_specific_to_general_single(token_list, rec_left)

  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/data_getter_sqlite.py", line 119, in _search_specific_to_general_single
    self._add_matches_to_potential_matches(new_matches, rec_left)

  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/data_getter_sqlite.py", line 167, in _add_matches_to_potential_matches
    scored_potential_match = self.matcher.scorer.score_match(rec_left.record_id, right_id)

  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/scorer_default.py", line 57, in score_match
    p = self._field_to_prob(f_left, record_left, record_right)

  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/scorer_default.py", line 78, in _field_to_prob
    prob_unmatching2 = self._get_prob_unmatching(unmatching_tokens_right, tokens_left, field_right, field_left)

  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/scorer_default.py", line 107, in _get_prob_unmatching
    return 1/prob

ZeroDivisionError: float division by zero
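
The last frame shows the failure is the plain division `1/prob` with `prob` equal to `0.0`. Below is a minimal sketch of one way a value like that can arise, plus one possible guard. It assumes, without confirmation from the package source, that `prob` is built up as a product of small per-token probabilities; `product_of_probs`, `inverse_prob`, and `floor` are hypothetical names for illustration and are not part of fuzzymatcher.

```python
import sys


def product_of_probs(probs):
    """Multiply per-token probabilities together (hypothetical stand-in
    for however the scorer accumulates `prob`)."""
    result = 1.0
    for p in probs:
        result *= p
    return result


def inverse_prob(prob, floor=sys.float_info.min):
    """Return 1/prob, clamping prob to a tiny positive floor first so the
    division cannot raise ZeroDivisionError."""
    return 1.0 / max(prob, floor)


# Enough small factors underflow to exactly 0.0 in floating point.
underflowed = product_of_probs([1e-20] * 20)

try:
    1 / underflowed
except ZeroDivisionError as err:
    print("unguarded:", err)

print("guarded:", inverse_prob(underflowed))
```

Whether clamping is appropriate depends on how the score is used downstream; it hides the zero rather than explaining where it came from, so it is worth printing the tokens of the record that triggers the error before patching anything.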

StephenCranney • Jul 10 '20 15:07

After an initially promising start with this package, as I widened my test data I quickly ran into the same bug.

Unfortunately I'm under time pressure so I'll have to move on and try other packages, but I'm also keen to hear of any possible fix.

Mike-Honey • Sep 12 '20 02:09

Ran into this error today. Any options for how to fix it?

sleeprock • Nov 16 '22 18:11