PDFSharp.Extensions

FlateDecode DeviceRGB Image not extracting properly

Open nsideras opened this issue 10 years ago • 5 comments

I have a PDF containing an image with FlateDecode encoding and DeviceRGB color space that does not extract as expected. With my limited understanding, I believe the color palette is not being read properly.

<< /Type /XObject
/Subtype /Image
/Height 1800
/Width 1200
/BitsPerComponent 8
/Length 128257
/Filter /FlateDecode
/DecodeParms << /Predictor 15
/Colors 3
/BitsPerComponent 8
/Columns 1200
>>
/ColorSpace /DeviceRGB
>>

I understand that not all images can currently be extracted, but I believe this image meets the properties of generally extractable images.

98ccc0ea322f4e80aa38cf353de2cd6e[1].pdf 00000001-001

nsideras avatar Dec 08 '15 18:12 nsideras

Try rotating the image before extracting it.

vanderkorn avatar Dec 09 '15 04:12 vanderkorn

I'm using the example code from README.md. Rotating the image object with Image.RotateFlip before saving simply exports the same image rotated, with the same glitches.

nsideras avatar Dec 09 '15 19:12 nsideras

I am having the same issue. The only difference is the size of my image; everything else is identical. The result is a similar-looking distortion of the image. Is there any progress on this?

kevbro02 avatar May 21 '16 02:05 kevbro02

Well, as soon as I posted that comment, I ran across this page (https://forums.adobe.com/thread/664902). I think this might be the insight I am missing. Apparently there is another step after inflating: undoing the PNG predictor. I don't know whether this step is already coded in PdfSharp or not.
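To make that extra step concrete, here is a rough sketch (in Python for brevity, not PdfSharp code) of what the /DecodeParms in the original post imply about the inflated data. With a PNG predictor (/Predictor 10 to 15), every row of the inflated stream starts with one filter-type byte, so the data is slightly larger than raw RGB. The function name `inflated_row_layout` is made up for illustration.

```python
def inflated_row_layout(columns, colors=3, bits_per_component=8):
    """Return (bytes_per_pixel, row_bytes, stride) for PNG-predicted data.

    Hypothetical helper: the parameters mirror the /DecodeParms keys
    /Columns, /Colors and /BitsPerComponent.
    """
    bytes_per_pixel = max(1, colors * bits_per_component // 8)
    row_bytes = (columns * colors * bits_per_component + 7) // 8
    stride = row_bytes + 1  # one leading filter-type byte per row
    return bytes_per_pixel, row_bytes, stride


# Values taken from the image dictionary in the original post.
bpp, row_bytes, stride = inflated_row_layout(columns=1200)
expected_inflated_size = stride * 1800   # rows still carry their filter byte
raw_rgb_size = row_bytes * 1800          # size once the predictor is undone
```

If the inflated stream is `expected_inflated_size` bytes long rather than `raw_rgb_size`, treating it directly as pixel data shifts every row by one more byte than the last, which would be consistent with the kind of distortion described in this thread.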

kevbro02 avatar May 21 '16 02:05 kevbro02

After implementing the logic for all the PNG filter variations, I was able to extract your image (and mine too). I first get the UnfilteredValue from the stream. I then created an array big enough to convert it to a 32-bit image (this is probably not necessary, but it seemed easiest at the time), stepped through each row, read the PNG filter type from the row's first byte, and filled in the new byte array based on which filter was applied to that row. That image uses every filter except 3 (Average).
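For anyone who lands here later, the steps described above can be sketched roughly as follows. This is not the code I actually wrote and it is not part of PdfSharp: it is a minimal Python illustration of the standard PNG filter types (0 None, 1 Sub, 2 Up, 3 Average, 4 Paeth), and the names `paeth` and `undo_png_predictor` are made up. It assumes the stream has already been inflated and that each sample is 8 bits, as in the dictionary from the original post.

```python
import zlib


def paeth(a, b, c):
    """Paeth predictor: pick whichever of left (a), up (b), up-left (c)
    is closest to a + b - c, preferring a, then b, on ties."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c


def undo_png_predictor(data, columns, colors=3, bits_per_component=8):
    """Reverse per-row PNG filtering on already-inflated stream data."""
    bpp = max(1, colors * bits_per_component // 8)   # bytes per pixel
    row_bytes = (columns * colors * bits_per_component + 7) // 8
    stride = row_bytes + 1                           # filter byte + row
    if len(data) % stride != 0:
        raise ValueError("data length is not a multiple of the row stride")

    out = bytearray()
    prev = bytearray(row_bytes)  # the row above the first row is all zeros
    for start in range(0, len(data), stride):
        filter_type = data[start]
        row = bytearray(data[start + 1:start + stride])
        for i in range(row_bytes):
            a = row[i - bpp] if i >= bpp else 0      # left (already decoded)
            b = prev[i]                              # up
            c = prev[i - bpp] if i >= bpp else 0     # up-left
            if filter_type == 0:
                pred = 0
            elif filter_type == 1:
                pred = a
            elif filter_type == 2:
                pred = b
            elif filter_type == 3:
                pred = (a + b) // 2
            elif filter_type == 4:
                pred = paeth(a, b, c)
            else:
                raise ValueError(f"unknown PNG filter type {filter_type}")
            row[i] = (row[i] + pred) & 0xFF
        out += row
        prev = row
    return bytes(out)


# Tiny made-up example: a 2x2 RGB image, first row Sub-filtered,
# second row Up-filtered, then deflated the way a PDF stream would be.
filtered = bytes([1, 10, 20, 30, 30, 30, 30,
                  2, 1, 2, 3, 4, 5, 6])
pixels = undo_png_predictor(zlib.decompress(zlib.compress(filtered)), columns=2)
```

The result is tightly packed RGB bytes with no per-row filter byte, so for the image in this issue it could go straight into a 24-bit bitmap; the 32-bit conversion I mentioned is optional.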


kevbro02 avatar May 22 '16 13:05 kevbro02