docspell

Multi-page PDFs

Open ghost opened this issue 3 years ago • 3 comments

Is it possible to rearrange, extract, split, or delete individual pages in a multi-page PDF? I'd like to scan and process a multi-page PDF and then, if desired, split out individual pages as their own items.

ghost avatar Mar 07 '22 23:03 ghost

This is not possible in Docspell. I think papermerge can do these things, perhaps it's worth a look?

eikek avatar Mar 08 '22 21:03 eikek

@CDarwin7 Not exactly what you are looking for, but I solved it with a small import wrapper. A systemd timer calls an import script every couple of minutes. When the script finds a new PDF, it hands the file to pdfcpu, which splits the document into single pages. The pages are then moved into the import folder of Docspell. Later, in Docspell, I merge them back together by just adding another document to the conversation.

I mainly use this because my scanning software sometimes struggles to remove blank pages.
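
For anyone who wants to try something similar, here is a minimal sketch of such a wrapper in Python. It is not the script described above. The function names and both folder arguments are hypothetical, and it assumes the `pdfcpu` binary is on the PATH and that `pdfcpu split inFile outDir` writes one file per page, which is its default behaviour. A systemd timer (or cron) would call `import_new_pdfs` every few minutes.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def pdfcpu_split(pdf: Path, out_dir: Path) -> None:
    """Split `pdf` into single-page files in `out_dir` via the pdfcpu CLI."""
    # Assumption: pdfcpu is installed; `split` defaults to one page per file.
    subprocess.run(["pdfcpu", "split", str(pdf), str(out_dir)], check=True)


def import_new_pdfs(incoming: Path, import_dir: Path, split=pdfcpu_split) -> list[Path]:
    """Split every PDF found in `incoming` and move the pages to `import_dir`.

    `incoming` is a hypothetical scanner drop folder; `import_dir` stands in
    for the folder that Docspell watches for new files.
    """
    moved = []
    for pdf in sorted(incoming.glob("*.pdf")):
        # Split into a scratch directory so the importer never sees partial output.
        with tempfile.TemporaryDirectory() as tmp:
            split(pdf, Path(tmp))
            for page in sorted(Path(tmp).glob("*.pdf")):
                target = import_dir / page.name
                shutil.move(str(page), str(target))
                moved.append(target)
        # Only reached if the split succeeded; the next run will skip this file.
        pdf.unlink()
    return moved
```

The splitter is passed in as a parameter so the folder handling can be tried out without pdfcpu installed. Note that this sketch deletes the original scan after a successful split; moving it to an archive folder instead would keep the unsplit file around.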

crane-denny avatar Jan 04 '23 13:01 crane-denny

I personally don't think files should ever be edited in a destructive way. If we allow this, we can only do so if the version history is kept.

Blank pages don't hurt. But rotated pages surely do affect OCR, so being able to rotate pages would be great. It's probably possible to save rotation information from the UI to the database without changing the original, and to use that information when rendering and when running OCR…
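
To make the idea concrete, here is a rough sketch of what non-destructive rotation metadata could look like. It is purely illustrative: the class and field names are hypothetical, and nothing here reflects Docspell's actual schema or code. The original file is never touched; a renderer or OCR step would look up the stored angle for each page and rotate its working copy.

```python
from dataclasses import dataclass, field


@dataclass
class PageRotations:
    """Per-page rotation overrides kept alongside an unmodified file."""

    # Hypothetical storage: page number -> clockwise rotation in degrees.
    degrees: dict[int, int] = field(default_factory=dict)

    def rotate(self, page: int, by: int) -> None:
        """Rotate `page` by `by` degrees relative to its current setting."""
        if by % 90 != 0:
            raise ValueError("rotation must be a multiple of 90 degrees")
        total = (self.degrees.get(page, 0) + by) % 360
        if total:
            self.degrees[page] = total
        else:
            # Back at the original orientation: nothing needs to be stored.
            self.degrees.pop(page, None)

    def for_page(self, page: int) -> int:
        """Angle to apply when rendering or running OCR on `page`."""
        return self.degrees.get(page, 0)
```

Because only the angle is stored, undoing a rotation just removes the entry, and the question of version history for the file itself does not arise.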

madduck avatar Sep 08 '23 21:09 madduck