Is there any way to extract the table into markdown format?
I want to extract the tables in a .docx file into markdown format while maintaining each table's position in the document. That means I can't use python-docx's document.paragraphs and document.tables to handle paragraphs and tables separately (this would destroy the positional relationship between them).
docx2python is very easy to use. I would like to know whether docx2python can save tables in markdown format, or whether it can separate tables, images and paragraphs in output.body. Thank you!
I am going to leave this issue open for a bit and think about how this might be seamlessly accomplished. Until then, here's a script that will identify tables for you.
https://github.com/ShayHill/transpose_docx_tables
As of Docx2Python v 3.0.0, tables are guaranteed to be nxm (n rows by m columns) and are straightforward to identify. See details near the top of the README file. I've also left an example of exporting tables as markdown in the tests folder. It's referenced in the README.
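For anyone who just wants the general idea, here is a minimal sketch of rendering an n×m table as markdown. It is not the example from the tests folder and it does not call docx2python at all: it assumes each table has already been reduced to a list of rows, where every row is a list of plain cell strings, and it treats the first row as the header. `table_to_markdown` is a hypothetical helper name.

```python
def table_to_markdown(rows):
    """Render a rectangular list of rows (lists of cell strings) as a Markdown table.

    Assumption: cells are already plain strings and the first row is the header.
    """
    if not rows:
        return ""

    def clean(cell):
        # Escape pipes and collapse line breaks so each cell stays on one line.
        return str(cell).replace("|", "\\|").replace("\n", " ").strip()

    header, *body = [[clean(cell) for cell in row] for row in rows]
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


# Hypothetical table data, standing in for one table pulled from a document.
example = [["Name", "Qty"], ["apple", "3"], ["pear | plum", "5"]]
markdown = table_to_markdown(example)
print(markdown)
```

Because the tables are rectangular, no padding of short rows is needed; if cells arrive as lists of paragraphs rather than strings, they would need to be joined into one string first.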