Adding color to the Text's metadata feat/text-color
Is your feature request related to a problem? Please describe. I have a branded manual whose section titles are in a specific color. I would like to chunk the PDF into sections using that color information.
Describe the solution you'd like When using partition_pdf with the "fast" strategy, the color of the text should be stored in the element metadata, and the documentation should reflect it.
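To illustrate the use case, here is a minimal sketch of how sections could be built once a color is available on each element. Everything in it is an assumption for illustration: `Element` is a plain stand-in class (not the unstructured one), `text_color` is a hypothetical metadata field that does not exist today, and `chunk_by_title_color` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Element:
    """Stand-in for a partitioned PDF element (not the unstructured class)."""
    text: str
    # Hypothetical metadata field requested in this issue, e.g. "#0055AA".
    text_color: Optional[str] = None


def chunk_by_title_color(elements: List[Element], title_color: str) -> List[List[Element]]:
    """Start a new section each time an element uses the title color."""
    sections: List[List[Element]] = []
    for el in elements:
        # Open a section on a title-colored element, or for any leading
        # text that appears before the first title.
        if el.text_color == title_color or not sections:
            sections.append([])
        sections[-1].append(el)
    return sections


elements = [
    Element("Installation", "#0055AA"),
    Element("Unpack the device.", "#000000"),
    Element("Plug it in.", "#000000"),
    Element("Maintenance", "#0055AA"),
    Element("Clean the filter monthly.", "#000000"),
]

sections = chunk_by_title_color(elements, "#0055AA")
for section in sections:
    print([el.text for el in section])
```

With the color exposed in the metadata, this kind of grouping would not depend on the detected category of each element.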
Describe alternatives you've considered I already tried the "by_title" chunking system, but some text.category values are wrong, or the sections are chunked to approximately 500 characters despite setting max_partition to None.
Additional context
I am using unstructured from the Docker image.