
OverflowError: Python integer 346 out of bounds for uint8

Open lalit97 opened this issue 7 months ago • 0 comments

Getting an OverflowError on Python 3.12 on Ubuntu when trying to calculate the Simhash for a few domains, for example wikipedia.org

simhash==2.1.2
numpy==2.3.0
Traceback (most recent call last):
  File "grouping.py", line 115, in fingerprint
    Simhash(html)
                         ^^^^^^^^^^^^^
  File "env/lib/python3.12/site-packages/simhash/__init__.py", line 79, in __init__
    self.build_by_text(unicode(value))
  File "env/lib/python3.12/site-packages/simhash/__init__.py", line 107, in build_by_text
    return self.build_by_features(features)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "env/lib/python3.12/site-packages/simhash/__init__.py", line 136, in build_by_features
    sums.append(self._bitarray_from_bytes(h) * w)
                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
OverflowError: Python integer 609 out of bounds for uint8
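
A minimal sketch that may reproduce the error outside of simhash. It assumes that `_bitarray_from_bytes` returns a `uint8` NumPy array of bits and that `w` is a plain Python integer weight; both are inferred from the traceback, not taken from the library source. The `digest` value is made up for illustration. Under NumPy 2.x, multiplying a `uint8` array by a Python integer that does not fit in `uint8` raises `OverflowError` instead of silently upcasting, which matches the message above.

```python
import numpy as np

# Hypothetical stand-in for the array built in _bitarray_from_bytes(h);
# assumed (not confirmed from the simhash source) to be a uint8 bit array.
digest = bytes([0b10110010])
bits = np.unpackbits(np.frombuffer(digest, dtype=np.uint8))  # dtype is uint8

weight = 609  # the value reported in the traceback

try:
    weighted = bits * weight  # on NumPy 2.x, 609 cannot be represented as uint8
except OverflowError as exc:
    print("overflow:", exc)
    weighted = None

# Possible workaround: widen the array before multiplying by the weight.
safe = bits.astype(np.int64) * weight
print(safe.tolist())
```

If this reproduces, pinning numpy below 2.0 or casting to a wider dtype before the multiplication are two possible ways around it until the library is updated.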

lalit97, Jun 18 '25 05:06