Robert Sachunsky

Results: 944 comments by Robert Sachunsky

> > In METS, we could also use some information on processing dates, e.g. `mets:agent/mets:note/@ocrd:date` (of `xsd:dateTime`).
>
> > Indeed, I will need to improve the page-to-alto conversion soon-ish and...
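To make the quoted proposal concrete, here is a minimal sketch of what a `mets:note` carrying an `ocrd:date` attribute might look like when built with Python's standard library. The `ocrd` namespace URI, the helper name `make_agent_note` and the note text are assumptions for illustration only, not something the comment specifies.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

NS_METS = "http://www.loc.gov/METS/"
NS_OCRD = "https://ocr-d.de"  # assumption: namespace URI bound to the ocrd prefix


def make_agent_note(text, when):
    """Build a mets:agent with one mets:note that carries an ocrd:date attribute.

    Hypothetical helper: `when` should be a timezone-aware datetime, so that
    isoformat() yields a lexical form accepted by xsd:dateTime.
    """
    agent = ET.Element(f"{{{NS_METS}}}agent")
    note = ET.SubElement(agent, f"{{{NS_METS}}}note")
    note.set(f"{{{NS_OCRD}}}date", when.isoformat())
    note.text = text
    return agent


# Illustrative values only
stamp = datetime(2024, 3, 1, 12, 0, 0, tzinfo=timezone.utc)
agent = make_agent_note("hypothetical processing step", stamp)
note = agent.find(f"{{{NS_METS}}}note")
print(note.get(f"{{{NS_OCRD}}}date"))
```

Using a timezone-aware timestamp keeps the value unambiguous across machines; whether the attribute belongs on `mets:note` at all is exactly what the thread is discussing.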

I don't know why the local docker installation does not pick up textract2page (assuming you ran `make docker` – the version on Dockerhub is hopelessly outdated), but have you tried...

How did you install textract2page – in a local venv (as advised by the readme)? Also, if you did pull the PR: you probably have to specify `pip install ./textract2page`...

Oops, sorry – turns out textract2page recently introduced a packaging bug, which affected editable installs. Please pull from the PR again (tip should be at bf89b08)!

This `ocrd__workspace` could also do what is needed for #938 – i.e. downloading locally (without changing the href).

Simplest test I know of so far is still [binarization](https://github.com/OCR-D/core/issues/1195#issuecomment-1973378620).

(And for some reason `xmlstarlet val` does **not** pick up the duplicate `@ID` problem, only `xmllint` does.)
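Since the validators disagree here, a schema-independent cross-check can help. The following is a rough sketch, assuming the duplicates are plain un-namespaced `ID` attributes; the function name `find_duplicate_ids` and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def find_duplicate_ids(xml_text, attr="ID"):
    """Return the sorted attribute values that occur on more than one element."""
    root = ET.fromstring(xml_text)
    counts = Counter(
        el.get(attr) for el in root.iter() if el.get(attr) is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)


# Made-up sample with one duplicated ID
SAMPLE = """<mets xmlns="http://www.loc.gov/METS/">
  <fileGrp USE="OCR-D-IMG">
    <file ID="FILE_0001"/>
    <file ID="FILE_0001"/>
    <file ID="FILE_0002"/>
  </fileGrp>
</mets>"""

print(find_duplicate_ids(SAMPLE))  # ['FILE_0001']
```

This only counts attribute values and does not validate anything against the METS schema, so it complements rather than replaces `xmllint`.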

I concur. Probably as another patch to OcrdPage / ocrd_page_generateds, say `TextRegionType.concatenate_TextEquiv()` etc.

> This could be a dupe

Not that I recall.

I just tried to merge master into this locally, but git failed to detect the `src/` renames – it marked `ocrd/decorators/__init__.py` as conflicting, but did not apply/retain any of my...

wowsa! CI seems to fail miserably...

```
tests/cli/test_bashlib.py F........ [ 1%]
...
tests/test_mets_server.py FF..............FFFF [ 50%]
...
tests/test_workspace.py ...................................F..... [ 80%]
...
```

Maybe it would be better to cherry-pick...