textract

extract text from any document. no muss. no fuss.

Results 158 textract issues

Update [argcomplete](https://pypi.org/project/argcomplete) from **1.10.3** to **2.0.0**. Changelog for 2.0.0: - Truncate input after cursor. Fixes 351 (352) - Support of path completion in fish 327 (359) -...

There are small typos in: - docs/installation.rst - textract/exceptions.py Fixes: - Should read `suppressed` rather than `supressed`. - Should read `documentation` rather than `documenation`. - Should read `accommodated` rather than...

When extracting a PDF using the pdfminer method, textract looks for an application called `pdf2txt.py`, but the spawn package automatically appends `.exe` to the name. That file doesn't exist, so...
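One possible workaround, sketched here rather than taken from the issue, is to resolve the script's full path first and run it through the current Python interpreter, so the operating system never has to execute the bare script name. The helper names `resolve_script` and `build_argv` are hypothetical and are not part of textract.

```python
import shutil
import sys


def resolve_script(name, which=shutil.which):
    """Return the full path of `name` on PATH, or fail with a clear error."""
    path = which(name)
    if path is None:
        raise FileNotFoundError(f"{name} was not found on PATH")
    return path


def build_argv(script, *args, which=shutil.which):
    """Build an argv list that runs `script` via the current interpreter.

    Because the interpreter is the executable, no `.exe` suffix has to be
    guessed for the script itself.
    """
    return [sys.executable, resolve_script(script, which), *args]
```

The resulting list can be passed to `subprocess.run`. The `which` parameter exists only so the lookup can be replaced in tests.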

I am trying to extract text from hundreds of thousands of PDFs using a computer cluster. I want to run commands like `textract cl-exec-201666USCOC.pdf -o test1.txt -m tesseract` where the...
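A minimal sketch of batching that command over many files, assuming the `textract` executable is on PATH and that one output `.txt` per PDF is wanted. The function names, the output naming scheme, and the worker count are illustrative assumptions; only the `-o` and `-m tesseract` flags come from the command quoted above.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(pdf_path, out_dir):
    """Build the textract command line for a single PDF."""
    pdf_path = Path(pdf_path)
    out_path = Path(out_dir) / (pdf_path.stem + ".txt")
    return ["textract", str(pdf_path), "-o", str(out_path), "-m", "tesseract"]


def run_all(pdf_paths, out_dir, workers=4):
    """Run one textract process per PDF and return the exit codes in order."""
    commands = [build_command(p, out_dir) for p in pdf_paths]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Threads are enough here: each one only waits on a child process.
        results = list(pool.map(
            lambda cmd: subprocess.run(cmd, capture_output=True), commands))
    return [r.returncode for r in results]
```

On a cluster, `build_command` could equally be used to write one command per line into a job file for the scheduler instead of calling `run_all` on a single node.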

**Describe the bug** ``` ERROR: Cannot install beautifulsoup4==4.11.1 and textract==1.6.3 because these package versions have conflicting dependencies. The conflict is caused by: The user requested beautifulsoup4==4.11.1 textract 1.6.3 depends on...

**Describe the bug** When parsing files with textract, specifically `.txt` files, the `input_encoding`/`output_encoding` arguments have no effect on the parsed text **To Reproduce** Steps to reproduce the behavior: 1. Create...

**Describe the bug** I work on a Mac, where a simple test on a sample PDF passes, but the same test fails in a Docker container. **To Reproduce** Steps to reproduce...

Added a fix for issue #342 caused by `extract_msg.Message._getStringStream` returning `None` for streams that are not found in the MSG file (this is intentional and should be handled accordingly). `ensure_bytes`...

For the moment, pip complains about these dependency conflicts: ``` textract 1.6.5 requires argcomplete~=1.10.0, but you have argcomplete 2.0.0 which is incompatible. textract 1.6.5 requires beautifulsoup4~=4.8.0, but you have beautifulsoup4...
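The `~=` operator in these messages is the "compatible release" specifier: `~=1.10.0` accepts any version that is at least `1.10.0` and still starts with `1.10`. The sketch below illustrates that rule for plain dotted numeric versions only. It is a simplified stand-in, not pip's resolver, and the function names are hypothetical.

```python
def parse(version):
    """Split a plain dotted version such as '1.10.3' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_compatible_release(installed, spec):
    """Return True if `installed` satisfies `~=spec` (numeric versions only)."""
    spec_parts = parse(spec)
    if len(spec_parts) < 2:
        raise ValueError("~= needs at least two version components")
    installed_parts = parse(installed)
    # Pad with zeros so that '1.10' compares equal to '1.10.0'.
    padding = len(spec_parts) - len(installed_parts)
    if padding > 0:
        installed_parts += (0,) * padding
    # Everything except the last component of the spec must match exactly.
    prefix = spec_parts[:-1]
    return installed_parts >= spec_parts and installed_parts[:len(prefix)] == prefix
```

With the versions from the message above, `is_compatible_release("2.0.0", "1.10.0")` is `False`, which is why pip reports argcomplete 2.0.0 as incompatible, while `is_compatible_release("1.10.3", "1.10.0")` is `True`.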

**Is your feature request related to a problem? Please describe.** A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] **Which filetype should textract...