feat: ability to skip non-plain-text element types in chunk_by_title()

Open cragwolfe opened this issue 2 years ago • 0 comments

Is your feature request related to a problem? Please describe.

chunk_by_title is a great way to combine related text elements. However, the caller may not want to combine all element types (e.g. Table and Figure) with other element types when forming the CompositeElements.

Describe the solution you'd like

Add skip_element_types=['Table', 'Figure', <... and any other "non-plain text" elements>] to chunk_by_title, and also make this parameter accessible from partition_ functions and unstructured-ingest.
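To make the request concrete, here is a minimal sketch of the behavior being asked for. It does not use the real unstructured API: `Element` and `chunk_by_title_sketch` are hypothetical stand-ins, and `skip_element_types` is the proposed parameter, not an existing one. It also assumes that a skipped element closes the chunk in progress so that document order is preserved; the actual design could reasonably choose otherwise.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence


@dataclass
class Element:
    # Hypothetical stand-in for a document element; `category` mimics
    # type names such as "Title", "NarrativeText", "Table", "Figure".
    category: str
    text: str


def chunk_by_title_sketch(
    elements: Iterable[Element],
    skip_element_types: Sequence[str] = (),
) -> List[Element]:
    """Toy chunker: a new chunk starts at each Title.

    Elements whose category is in `skip_element_types` are never merged;
    they are emitted unchanged, in their original position.
    """
    skip = set(skip_element_types)
    chunks: List[Element] = []
    pending: List[Element] = []

    def flush() -> None:
        # Combine the accumulated text elements into one composite chunk.
        if pending:
            text = "\n\n".join(e.text for e in pending)
            chunks.append(Element("CompositeElement", text))
            pending.clear()

    for el in elements:
        if el.category in skip:
            flush()             # assumption: keep document order
            chunks.append(el)   # pass the skipped element through as-is
        elif el.category == "Title":
            flush()             # a title opens a new section
            pending.append(el)
        else:
            pending.append(el)
    flush()
    return chunks


sample = [
    Element("Title", "Intro"),
    Element("NarrativeText", "A."),
    Element("Table", "t"),
    Element("NarrativeText", "B."),
    Element("Title", "Next"),
    Element("NarrativeText", "C."),
]
chunks = chunk_by_title_sketch(sample, skip_element_types=["Table", "Figure"])
```

With the default (nothing skipped), the table text is folded into the first section's composite chunk, which is the current behavior this issue wants to make optional.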

cragwolfe • Oct 10 '23 06:10