
problem in NameObject.readFromStream when stream.read(1) does not advance

Open • ccurvey opened this issue 15 years ago • 1 comment

I'm in way over my head here...kind of feel like the blind pig that found an acorn. Anyway, I'm trying to process a PDF that contains the following items:

10 0 obj /DeviceGray endobj

The problem is that when the line "/DeviceGray" is read, tok = stream.read(1) does not seem to advance the file pointer. (I checked by looking at the value of stream.tell() before and after the stream.read())

I don't know why the pointer does not get advanced, but making the code look like this fixes the problem, and things seem to move along just fine.

    while True:
        pre_read = stream.tell()  # new: remember the position before reading
        tok = stream.read(1)
        # new: also stop when the read did not advance the file pointer
        if tok.isspace() or tok in NameObject.delimiterCharacters or stream.tell() == pre_read:
            stream.seek(-1, 1)
            break
        name += tok
    return NameObject(name)
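
One possible explanation, offered as an assumption and not a confirmed diagnosis for this PDF: a read at the end of a stream returns an empty string and leaves the position unchanged. An empty string is not whitespace, and it is not a member of a tuple of delimiter characters, so a loop like the one above would never hit its exit condition. The standalone sketch below is not pyPdf code. `read_name` and `DELIMITERS` are hypothetical names, and it uses `io.BytesIO` with bytes to show the behavior in isolation. It treats an empty read as end of input and skips the seek in that case, since nothing was consumed.

```python
import io

# Assumed delimiter set for illustration, based on the PDF delimiter characters.
DELIMITERS = b"()<>[]{}/%"


def read_name(stream):
    """Read a PDF name token such as /DeviceGray from a binary stream."""
    name = stream.read(1)
    if name != b"/":
        raise ValueError("a name must start with '/'")
    while True:
        tok = stream.read(1)
        if not tok:
            # End of stream: read() returned nothing and the position did
            # not move, so there is no character to un-read.
            break
        if tok.isspace() or tok in DELIMITERS:
            # Un-read the terminating character so the caller sees it next.
            stream.seek(-1, 1)
            break
        name += tok
    return name


# A read at end of stream returns b"" and does not advance the position.
probe = io.BytesIO(b"x")
probe.read(1)
position_before = probe.tell()
empty_read = probe.read(1)
position_after = probe.tell()

# A name that ends exactly at the end of the stream still terminates.
name_at_eof = read_name(io.BytesIO(b"/DeviceGray"))

# A name followed by a newline leaves the newline for the next reader.
followed = io.BytesIO(b"/DeviceGray\nendobj")
name_followed = read_name(followed)
rest = followed.read()
```

The difference from the patch above is that the end-of-stream case breaks out without calling `seek(-1, 1)`, which would otherwise step back onto the last character of the name.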

I can provide a copy of the PDF to someone if they want an example. (Note to self: this is 98421_SupLegal 2008-02 Stmt_p83_r8.pdf)

ccurvey, Sep 17 '10 18:09

Whoops... I just noticed that in the original post, the PDF formatting got messed up. The "endobj" is on the line after "/DeviceGray" (at least in vim with the PDF plugin). That might explain the problem.

ccurvey, Sep 17 '10 18:09