markitdown
fix docx parse error (docx testcase: \n in alt)
In the test.docx test case, images are not parsed correctly.
The problem is not the image URI but the parsing of the alt text.
The DOCX format allows an image's alt text to span multiple lines, but Markdown does not allow the alt text to wrap across lines.
My change replaces the line breaks in multi-line alt text.
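
For illustration, the idea can be sketched roughly as follows. This is a minimal standalone sketch, not the actual patch: the helper names `flatten_alt_text` and `image_markdown` are hypothetical, and replacing each line break with a single space is an assumption (the description above only says the line breaks are replaced).

```python
import re


def flatten_alt_text(alt: str) -> str:
    """Collapse every run of line breaks (and the whitespace around it) into one space.

    Hypothetical helper; the replacement character (a space) is an assumption.
    """
    return re.sub(r"\s*[\r\n]+\s*", " ", alt).strip()


def image_markdown(alt: str, src: str) -> str:
    """Build a single-line Markdown image from possibly multi-line alt text."""
    return f"![{flatten_alt_text(alt)}]({src})"
```

With this sketch, alt text such as `"first line\nsecond line"` is emitted as `first line second line` inside the brackets, so the image stays on one line regardless of whether the source is a data URI or a regular URL.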
To reproduce: run the test case against test.docx in the testfile folder with keep_data_uri=True.