markitdown
Python tool for converting files and office documents to Markdown.
Hi, is there any MarkItDown documentation? Thank you.
It's useful to keep image data URIs for later processing. See https://github.com/microsoft/markitdown/issues/51
Addresses https://github.com/microsoft/markitdown/issues/89
Set up a Web API so users can use the library via a REST endpoint. This is also useful for Docker scenarios.
As a feature suggestion, I would love if this could allow you to plug in an LLM (such as GPT-4o) where images that are included in the content could be...
UnicodeEncodeError: 'gbk' codec can't encode character '\U0001f4a1' UnicodeEncodeError: 'gbk' codec can't encode character '\u2022'
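For anyone hitting the same error, here is a minimal sketch of what is probably going on, using only the Python standard library. It assumes the failure comes from writing text to a console or stream whose encoding is GBK (common on Chinese-locale Windows); the issue itself only shows the error messages, so that cause is an assumption, and the sample text and file name are made up for illustration.

```python
import os
import tempfile

# Sample text containing the two characters named in the error messages.
text = "\U0001f4a1 tip \u2022 item"

# Strict encoding to GBK fails because neither character exists in that codec.
try:
    text.encode("gbk")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Option 1: replace unencodable characters so a GBK stream cannot raise.
safe = text.encode("gbk", errors="replace").decode("gbk")

# Option 2: write the text to a file with an explicit UTF-8 encoding
# ("output.md" is a hypothetical file name).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output.md")
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, encoding="utf-8") as f:
        round_trip = f.read()

print(strict_failed, round_trip == text)
```

Other possible workarounds are calling `sys.stdout.reconfigure(encoding="utf-8")` (Python 3.7+) before printing, or setting the `PYTHONUTF8=1` environment variable so Python uses UTF-8 by default.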
Do you have a library that converts Markdown to Office documents?
And it wasn't going to do it. It just takes the output from pdfminer, which is described as "a text extraction tool for PDF documents". Markdown is something different.