docconv

Converts PDF, DOC, DOCX, XML, HTML, RTF, etc to plain text

Results: 40 docconv issues

Hello, could anyone help me? How do I use this library on a Windows machine, given that it requires installing dependencies? Is there a tutorial? Thank you.

Those are the defaults for docd, but they don't apply to library usage, which causes the problem seen in issue #78. This PR can have a drawback if you intentionally...

`.tif` files should map to the `image/tiff` MIME type. List of official MIME types: https://www.iana.org/assignments/media-types/media-types.xhtml From the MDN web docs: https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types

I use this as a library and want to be able to send all output to my own logger, so I propose this change.

Lines 102 and 103 in doc.go cause a deadlock, mainly because the goroutine started above fails to send valid data on the channel ``` body :=

Hi, sometimes the letter case in Content-Types.xml and the zipped file names do not match. This may not be an issue on Windows, but it is an issue when such a file is searched in...
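The case mismatch described above can be illustrated with a minimal sketch that looks up a zip entry while ignoring letter case. This is not docconv's code: `findFileFold` and `buildZip` are hypothetical helpers, and the entry names are made up for the example. It prefers an exact match and only falls back to a case-insensitive comparison.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"strings"
)

// findFileFold (hypothetical helper) returns the entry whose name matches
// exactly, or failing that, the first entry that matches ignoring case.
// It returns nil when nothing matches.
func findFileFold(files []*zip.File, name string) *zip.File {
	var folded *zip.File
	for _, f := range files {
		if f.Name == name {
			return f
		}
		if folded == nil && strings.EqualFold(f.Name, name) {
			folded = f
		}
	}
	return folded
}

// buildZip (hypothetical helper) creates an in-memory archive containing
// empty entries with the given names, so the example needs no files on disk.
func buildZip(names ...string) (*zip.Reader, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	for _, n := range names {
		if _, err := w.Create(n); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
}

func main() {
	r, err := buildZip("word/Document.xml")
	if err != nil {
		panic(err)
	}
	// The manifest refers to "word/document.xml", but the archive stores
	// the entry with a capital "D".
	if f := findFileFold(r.File, "word/document.xml"); f != nil {
		fmt.Println("found:", f.Name)
	}
}
```

An exact match is tried first so that archives containing two entries differing only in case still resolve to the intended one.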

I was trying to parse content from multiple document formats, and it turns out that it works for formats such as `pdf` and `doc`, but not for HTML files. Somehow below...

Hi, I was having issues when trying to build the code on macOS targeting Linux, so I created a script to build the code on the Docker image instead, with...

Hi, when the XML is not encoded in UTF-8, the decoder requires a charset reader. Credits: https://stackoverflow.com/questions/6002619/unmarshal-an-iso-8859-1-xml-input-in-go/32224438#32224438
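For context, Go's `encoding/xml` decoder returns an error for a non-UTF-8 encoding declaration unless its `CharsetReader` field is set. The sketch below is a standard-library-only illustration, not the change proposed in the issue: it handles ISO-8859-1 alone, and `latin1CharsetReader`, `decodeTitle`, and the `<doc><title>` layout are assumptions made for the example. Real code would more likely use a general-purpose charset package.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// latin1CharsetReader (hypothetical helper) has the signature expected by
// xml.Decoder.CharsetReader. It supports only ISO-8859-1, where every byte
// value equals the Unicode code point of the character it encodes.
func latin1CharsetReader(label string, input io.Reader) (io.Reader, error) {
	switch strings.ToLower(label) {
	case "iso-8859-1", "latin1":
		raw, err := io.ReadAll(input)
		if err != nil {
			return nil, err
		}
		runes := make([]rune, len(raw))
		for i, b := range raw {
			runes[i] = rune(b)
		}
		// Converting []rune to string produces valid UTF-8.
		return strings.NewReader(string(runes)), nil
	}
	return nil, fmt.Errorf("unsupported charset %q", label)
}

// decodeTitle (hypothetical helper) parses a small document and returns the
// text of its <title> element.
func decodeTitle(data []byte) (string, error) {
	var doc struct {
		Title string `xml:"title"`
	}
	dec := xml.NewDecoder(bytes.NewReader(data))
	dec.CharsetReader = latin1CharsetReader
	if err := dec.Decode(&doc); err != nil {
		return "", err
	}
	return doc.Title, nil
}

func main() {
	// "\xe9" is the single ISO-8859-1 byte for "é".
	input := []byte("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>" +
		"<doc><title>caf\xe9</title></doc>")
	title, err := decodeTitle(input)
	if err != nil {
		panic(err)
	}
	fmt.Println(title) // prints "café"
}
```

The decoder only calls `CharsetReader` when the declared encoding is something other than UTF-8, so UTF-8 documents pass through unchanged.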