
Consolidate advanced chunker notebook

Open vagenas opened this issue 1 year ago • 0 comments

Main improvements with this PR:

  • Set chunk.text directly to the updated text (including any headings and captions)
  • Add type annotations
  • Switch to list comprehensions where possible
  • Encapsulate all methods within the new chunker implementation
  • Use a dataclass instead of an unmanaged dictionary
  • List dependencies in the setup installation line
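
To illustrate the kind of refactor the list above describes, here is a minimal sketch. It is not the notebook's actual code: the names ChunkRecord, SketchChunker, _contextualize, and update are hypothetical, and the order in which headings and captions are joined to the body text is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ChunkRecord:
    """Hypothetical typed record standing in for a free-form dictionary."""

    text: str
    headings: list[str] = field(default_factory=list)
    captions: list[str] = field(default_factory=list)


class SketchChunker:
    """Hypothetical chunker that keeps its helpers as methods of one class."""

    def __init__(self, delim: str = "\n") -> None:
        self.delim = delim

    def _contextualize(self, rec: ChunkRecord) -> str:
        # Join headings, captions, and body text, skipping empty parts.
        # The ordering here is an assumption made for illustration.
        parts = [*rec.headings, *rec.captions, rec.text]
        return self.delim.join(p for p in parts if p)

    def update(self, records: list[ChunkRecord]) -> list[ChunkRecord]:
        # Build new records whose text is set directly to the updated text,
        # using a list comprehension instead of an explicit loop.
        return [
            ChunkRecord(
                text=self._contextualize(r),
                headings=r.headings,
                captions=r.captions,
            )
            for r in records
        ]
```

With a dataclass, a misspelled field name fails at construction time instead of silently creating a new dictionary key, and the type annotations make the expected shape of each chunk explicit.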

vagenas, Nov 11 '24 16:11