Merge/concatenate all Pillow images into one single image
Hi Edouard,
Is there a way to merge the PIL images into one single image for a multipage PDF?
I see that in parsers.py, the function parse_buffer_to_jpeg converts the byte buffer into a list of PIL images with Image.open( .. ) by splitting on \xff\xd9, which is great, but when I printed the buffer I saw the following:
Any JPEG buffer starts with \xff\xd8 (the start-of-image marker) and ends with \xff\xd9 (the end-of-image marker). With a multipage PDF, there is a sequence like \xff\xd9\xff\xd8 in the middle (the end of page 1 immediately followed by the start of page 2). I tried removing it and replacing it with some random bytes, but it didn't work: the Pillow image is created, but saving it fails. Am I missing something here?
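For reference, the boundary described above can be used to cut the buffer back into one byte string per page. Removing the boundary does not produce one taller image, because each page is still its own JPEG stream with its own header and dimensions. The following is only a minimal sketch, not pdf2image's actual implementation: `split_jpeg_buffer` is a hypothetical helper, and it assumes the pages are plain back-to-back JPEG streams with no embedded thumbnails that could contain the same marker pair.

```python
from typing import List

SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def split_jpeg_buffer(data: bytes) -> List[bytes]:
    """Split back-to-back JPEG streams into one bytes object per image.

    Hypothetical helper: cuts only where an EOI marker is immediately
    followed by an SOI marker, so each piece keeps both of its markers.
    """
    frames = []
    boundary = EOI + SOI
    start = 0
    while True:
        cut = data.find(boundary, start)
        if cut == -1:
            break
        # Keep the EOI marker with the frame it terminates.
        frames.append(data[start:cut + len(EOI)])
        start = cut + len(EOI)
    tail = data[start:]
    if tail:
        frames.append(tail)
    return frames
```

Each returned piece could then be wrapped in `io.BytesIO` and passed to `Image.open` on its own.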
Could we have a solution or workaround for this? We need it ASAP.
The reason for asking is that we want to concatenate all the images into one single image and pass it to another system, and the merge step currently consumes a lot of RAM.
Here is the logic that merges the images into one single image, in case you're interested in debugging why it consumes so much memory:
from PIL import Image

def merge(pil_images):
    # Stack the pages vertically with a 10 px gap between them.
    widths, heights = zip(*(i.size for i in pil_images))
    max_width = max(widths)
    total_height = sum(heights) + (len(pil_images) - 1) * 10
    merged_pil_image = Image.new('RGB', (max_width, total_height))
    y_offset = 0
    for im in pil_images:
        merged_pil_image.paste(im, (0, y_offset))
        y_offset += im.height + 10
    return merged_pil_image
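One possible way to lower the peak memory of a merge like the one above is to keep the pages on disk and decode them one at a time, so that only the canvas and a single page are in memory together. The sketch below assumes the pages were already written to image files (pdf2image's `output_folder` and `paths_only` options may help with that, though this has not been verified here). `merge_pages_from_paths` and its `gap` and `mode` parameters are hypothetical names, not part of any library.

```python
from typing import Sequence

from PIL import Image


def merge_pages_from_paths(paths: Sequence[str], gap: int = 10,
                           mode: str = "RGB") -> Image.Image:
    """Stack page images vertically, decoding one page at a time."""
    if not paths:
        raise ValueError("need at least one page")

    # Image.open reads only the header here, so collecting sizes is cheap.
    sizes = []
    for path in paths:
        with Image.open(path) as im:
            sizes.append(im.size)

    width = max(w for w, _ in sizes)
    height = sum(h for _, h in sizes) + gap * (len(sizes) - 1)
    canvas = Image.new(mode, (width, height))

    y_offset = 0
    for path in paths:
        # The page is decoded inside this block and released when it exits.
        with Image.open(path) as im:
            canvas.paste(im.convert(mode), (0, y_offset))
            y_offset += im.height + gap
    return canvas
```

If the downstream system does not need colour, passing `mode="L"` (8-bit grayscale) should shrink the canvas considerably, at the cost of losing colour information.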
@Belval ^^
Attaching the memory profiler output:
There were 2 pages in the PDF. Allocating the merged image and pasting the pages into it each added about 342 MiB :'(
Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
   272    675.0 MiB    675.0 MiB           1   @profile
   273                                         def merge_pil_images(pil_images: Union[PIL.Image.Image, List[PIL.Image.Image]]) -> PIL.Image.Image:
   274    675.0 MiB      0.0 MiB           9       widths, heights = zip(*(i.size for i in pil_images))
   275
   276    675.0 MiB      0.0 MiB           1       total_width = max(widths)
   277    675.0 MiB      0.0 MiB           1       max_height = sum(heights) + (len(pil_images) -1)* 10
   278
   279   1016.9 MiB    341.9 MiB           1       merged_pil_image = PIL.Image.new('RGB', (total_width, max_height))
   280   1016.9 MiB      0.0 MiB           1       x_offset = 0
   281   1358.6 MiB      0.0 MiB           4       for im in pil_images:
   282   1358.6 MiB    341.7 MiB           3           merged_pil_image.paste(im, (0, x_offset))
   283   1358.6 MiB      0.0 MiB           3           x_offset += im.height +10
   284   1358.6 MiB      0.0 MiB           1       return merged_pil_image
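The jump at the `PIL.Image.new` line can be sanity-checked with simple arithmetic: canvas memory is roughly width × height × bytes per pixel. The sketch below assumes 4 bytes per pixel for an 'RGB' canvas, which is my understanding of how Pillow stores that mode internally; treat it as an assumption. `estimate_canvas_mib` is a hypothetical helper, and the page sizes in the usage note are made up, since the real page dimensions are not given here.

```python
from typing import Sequence, Tuple


def estimate_canvas_mib(page_sizes: Sequence[Tuple[int, int]],
                        gap: int = 10,
                        bytes_per_pixel: int = 4) -> float:
    """Rough size, in MiB, of a canvas that stacks the pages vertically.

    bytes_per_pixel=4 is an assumption about 'RGB' storage; use 1 to
    estimate an 8-bit grayscale ('L') canvas instead.
    """
    width = max(w for w, _ in page_sizes)
    height = sum(h for _, h in page_sizes) + gap * (len(page_sizes) - 1)
    return width * height * bytes_per_pixel / (1024 ** 2)
```

For example, `estimate_canvas_mib([(1024, 512), (1024, 502)])` describes a 1024 × 1024 canvas and evaluates to 4.0 MiB. Plugging in the real page dimensions should show whether roughly 342 MiB is simply the expected cost of a canvas that large.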