python-docx2txt
How to differentiate between header text vs paragraph text?
Hi! I absolutely love this project. Quick question though.
After processing a document and printing the result, is there a way to see what is header text vs what is just a paragraph?
Right now all text, including headers, is printed together with the paragraph text.
Thanks!
This should be possible with small changes to the code. If you look at the source, the text from the header, the main document, and the footer is all appended to a single 'text' variable. You can collect each of these in its own variable instead and print them as needed.
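As a rough illustration of that idea, here is a minimal standalone sketch that uses only the standard library and is not the project's actual code. It assumes the usual .docx layout, where page headers live in `word/header*.xml`, the body in `word/document.xml`, and page footers in `word/footer*.xml`. The names `xml_to_text` and `process_parts` are hypothetical. Note that this separates page headers and footers only; heading-styled paragraphs inside the body would still come back as body text.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the XML parts of a .docx file.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def xml_to_text(xml_bytes):
    """Join the text runs of each paragraph, one paragraph per line."""
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        runs = [node.text for node in para.iter(W_NS + "t") if node.text]
        if runs:
            paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


def process_parts(docx_path):
    """Hypothetical helper: return header, body and footer text separately."""
    parts = {"header": [], "body": [], "footer": []}
    with zipfile.ZipFile(docx_path) as archive:
        for name in archive.namelist():
            # Route each XML part into its own bucket instead of one string.
            if re.match(r"word/header[0-9]*\.xml$", name):
                parts["header"].append(xml_to_text(archive.read(name)))
            elif name == "word/document.xml":
                parts["body"].append(xml_to_text(archive.read(name)))
            elif re.match(r"word/footer[0-9]*\.xml$", name):
                parts["footer"].append(xml_to_text(archive.read(name)))
    return {key: "\n".join(chunks) for key, chunks in parts.items()}
```

With something like this, `process_parts("example.docx")["header"]` would give only the header text, and the `"body"` and `"footer"` keys the rest, so each can be printed on its own.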
Feel free to send a PR for this.