
Python 3.5 compatibility

Open andjelx opened this issue 8 years ago • 6 comments

The library doesn't seem to be 100% Python 3 compatible. When I try to run this simple code:

import doc2text

doc = doc2text.Document()
doc = doc2text.Document(lang="eng")
doc.read('pdf-sample.pdf')

I get:

Traceback (most recent call last):
  File "doc2text_test.py", line 13, in <module>
    doc.read('pdf-sample.pdf')
  File "/usr/local/lib/python3.5/dist-packages/doc2text/__init__.py", line 44, in read
    for i in xrange(self.num_pages):
NameError: name 'xrange' is not defined

andjelx, Feb 01 '17 07:02

Pull request #25 created

andjelx, Feb 01 '17 07:02

The code in `__init__.py`, line 44, needs to change from `for i in xrange(self.num_pages):`

to

for i in list(range(self.num_pages)):

neel17, Dec 19 '18 09:12

@neel17 xrange returns a lazy sequence, not a list, which uses less memory. In Python 3, range behaves the same way, so wrapping it in list() only adds overhead.
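To illustrate the memory point, here is a minimal sketch comparing a bare `range` with `list(range(...))` on Python 3. The `num_pages` value is a made-up number for illustration, not something taken from doc2text, and the exact byte counts depend on the interpreter.

```python
import sys

num_pages = 100000  # hypothetical page count, for illustration only

lazy_pages = range(num_pages)         # Python 3 range: values produced on demand
eager_pages = list(range(num_pages))  # materializes every index up front

# The range object stays small regardless of length; the list grows with it.
range_size = sys.getsizeof(lazy_pages)
list_size = sys.getsizeof(eager_pages)
print(range_size, list_size)

# Both yield the same indices, so a for loop does not need the list.
same_indices = list(lazy_pages) == eager_pages
```

For a document with a handful of pages the difference is negligible, but there is no benefit to building the list just to loop over it.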

andjelx, Dec 19 '18 09:12

@andjelx xrange is not supported in Python 3. What would be a suitable workaround?
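One common workaround, sketched here as an assumption and not necessarily what the pull request does, is a small compatibility shim that aliases `xrange` to `range` when the name is missing. The `page_indices` helper is hypothetical and exists only to show the shim in use.

```python
try:
    xrange  # Python 2: the built-in exists, nothing to do
except NameError:
    xrange = range  # Python 3: range is already lazy, so alias it


def page_indices(num_pages):
    """Hypothetical helper: lazy page indices on either Python version."""
    return xrange(num_pages)


for i in page_indices(3):
    print(i)
```

If Python 2 support is not needed, replacing `xrange` with `range` directly is simpler.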

neel17, Dec 19 '18 10:12

@neel17 Have you checked the PR?

andjelx, Dec 19 '18 10:12

@andjelx: My apologies, your PR works fine. I was trying to solve it myself and got the issue resolved with the solution above as well. Thanks for helping me understand it properly!

neel17, Dec 19 '18 10:12