unilex-transcript

Get semantic HTML from PDFs, recover lost text, tables, data... in bulk.

Results: 5 unilex-transcript issues

This results in output files in the host mounted volume being owned by the host user
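One common way to get this ownership behaviour is to start the container with the host user's UID and GID via `docker run --user`. The sketch below only builds the argument list and is not taken from this project's launcher: the helper name `docker_run_args`, the image name, and the `/pdf` mount point are all hypothetical. It assumes a POSIX host, since `os.getuid` and `os.getgid` are unavailable on Windows.

```python
import os


def docker_run_args(image, host_dir, container_dir="/pdf", extra_args=()):
    """Build a `docker run` argv that runs the container as the host user.

    `image` and `container_dir` are placeholders (assumptions), not values
    used by unilex-transcript itself.
    """
    # Run as the invoking user so files written to the mounted volume
    # are owned by that user on the host instead of root.
    uid_gid = f"{os.getuid()}:{os.getgid()}"
    return [
        "docker", "run", "--rm",
        "--user", uid_gid,
        "--volume", f"{os.path.abspath(host_dir)}:{container_dir}",
        "--workdir", container_dir,
        image,
        *extra_args,
    ]


if __name__ == "__main__":
    # Hypothetical image name, for illustration only.
    print(docker_run_args("example/pdf2htmlex-image", ".", extra_args=["input.pdf"]))
```

The resulting list can be passed to `subprocess.run` unchanged. Building it as a list rather than a shell string avoids quoting problems with paths that contain spaces.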

I can't install `pdf2htmlex` on Fedora, so I tried the Docker approach. It won't create output files for some reason. I looked at how you're launching the image and...

It cannot create HTML tables from tables in the PDF and skips them entirely, including their text. Not very useful. Tested on an Intel manual (see attached file): [vol2a_2016_test1.pdf](https://github.com/fmalina/unilex-transcript/files/7775001/vol2a_2016_test1.pdf)

How do I use this if I already have pdf2htmlEX set up and running from the Debian package on Ubuntu 22.04? Please let me know, as I am already converting pdf...

I just did a fresh install of pdftranscript in a venv and I'm getting import errors when I try to run it. I have installed the pdf2htmlEX Debian package and...