Image is not placed at `pictures` section

Open lordofthejars opened this issue 3 months ago • 7 comments

Question

I am using Docling with Docling Serve 1.8.0. I am converting the document https://arxiv.org/pdf/2311.18481 and, with the default options, producing the JSON output.

The document has a picture on page two. In the `pictures` section of the JSON document the image entry is present, but its image content (base64) is missing. What option do I need to set to embed the image there?

Thank you very much

lordofthejars avatar Nov 10 '25 21:11 lordofthejars

Hi @lordofthejars! I'm Dosu and I’m helping the docling team.

To embed image content as base64 in the `pictures` section of your JSON output, set the option `image_export_mode` to `"embedded"` and ensure `include_images` is set to `true` in your request. The default options do not guarantee base64 embedding; these settings are required to include the actual image data in the JSON. Avoid setting `image_export_mode` to `"placeholder"`, as that will only output text placeholders instead of the image data. If you want to force all images to be processed (including small ones), you can also set `picture_description_area_threshold` to `0.0`, but this is usually not necessary for standard PDF images.
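As a rough illustration of the options mentioned above, here is a minimal Python sketch that builds a request body with `image_export_mode` and `include_images` set explicitly. The helper name `build_convert_request`, the body layout (`options` plus a `sources` list with `kind: "http"`), and the `to_formats` option are assumptions about the Docling Serve v1 API and are not confirmed in this thread; check them against the API schema of your server version.

```python
import json

# Mode names taken from the discussion in this thread.
ALLOWED_IMAGE_MODES = {"embedded", "placeholder", "referenced"}


def build_convert_request(url, image_export_mode="embedded", include_images=True):
    """Build a JSON-serializable request body (hypothetical helper).

    The body layout is an assumption about the Docling Serve v1 API.
    """
    if image_export_mode not in ALLOWED_IMAGE_MODES:
        raise ValueError(f"unknown image_export_mode: {image_export_mode!r}")
    return {
        "options": {
            "to_formats": ["json"],  # assumed option name for JSON output
            "image_export_mode": image_export_mode,
            "include_images": include_images,
        },
        "sources": [{"kind": "http", "url": url}],  # assumed source layout
    }


payload = build_convert_request("https://arxiv.org/pdf/2311.18481")
print(json.dumps(payload, indent=2))
```

The resulting body would then be sent with an HTTP POST to the server's convert endpoint (assumed here to be `/v1/convert/source`), for example with `requests.post(..., json=payload)`.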

To reply, just mention @dosu.


dosubot[bot] avatar Nov 10 '25 21:11 dosubot[bot]

@dosu is it possible this is not supported in docling serve?

lordofthejars avatar Nov 10 '25 21:11 lordofthejars

Embedding images as base64 in the JSON output is supported in docling-serve, including version 1.8.0. To get the image content in the `pictures` section, set `image_export_mode` to `"embedded"` (which is the default) and ensure `include_images` is `true` in your request options. If the image content is still missing, it's likely due to a configuration issue, a bug, or a problem with how the output is being serialized, not because the feature is unsupported. This behavior is documented in the official usage guide and confirmed in the codebase (source).
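To tell a configuration problem from a bug, it can help to check the returned JSON directly. The sketch below assumes the exported document follows the DoclingDocument layout, where each entry in `pictures` carries a `self_ref` and, when embedded, an `image` object whose `uri` is a base64 data URI. The function name `pictures_without_embedded_image` and the sample document are hypothetical.

```python
def pictures_without_embedded_image(doc):
    """Return references of picture entries that lack a base64 data URI.

    Assumes `doc` is the parsed JSON document with a top-level `pictures` list.
    """
    missing = []
    for index, picture in enumerate(doc.get("pictures", [])):
        image = picture.get("image") or {}  # the field may be absent or null
        uri = str(image.get("uri") or "")
        if not (uri.startswith("data:") and ";base64," in uri):
            missing.append(picture.get("self_ref", f"#/pictures/{index}"))
    return missing


# Hypothetical sample: one picture with embedded data, one without.
sample = {
    "pictures": [
        {"self_ref": "#/pictures/0",
         "image": {"mimetype": "image/png", "uri": "data:image/png;base64,iVBORw0KGgo="}},
        {"self_ref": "#/pictures/1", "image": None},
    ]
}
print(pictures_without_embedded_image(sample))
```

If the list is non-empty even though `"embedded"` was requested, the behavior matches what is reported in this issue.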

dosubot[bot] avatar Nov 10 '25 21:11 dosubot[bot]

@dosu sorry, I am using the Docling Serve UI. Maybe the option is not supported there?

lordofthejars avatar Nov 10 '25 22:11 lordofthejars

The Docling Serve UI does support setting the "Image Export Mode": you can choose "Embedded" to request base64-encoded images in your JSON output. This option is available in the Options panel as a radio button labeled "Image Export Mode" with choices like "Embedded", "Placeholder", and "Referenced"[^1]. However, there isn't a separate UI control for `include_images`; it defaults to `true`, so images should be included when "Embedded" is selected. If you're still not seeing base64 image data in the JSON, it could be due to a bug or a limitation in how the UI processes or displays the output, not because the option is unsupported.

[^1]: See the UI code for details

dosubot[bot] avatar Nov 10 '25 22:11 dosubot[bot]

OK, then it seems to be a bug.

lordofthejars avatar Nov 10 '25 22:11 lordofthejars

I can confirm it: neither the embedded nor the placeholder option outputs JSON with the image item.

vku-ibm avatar Nov 11 '25 12:11 vku-ibm