ocrd_segment
OCR-D-compliant page segmentation
Hi, I think I have found a bug in `ocrd-segment-extract-lines`: I cannot prove it 100%, but in my environment I see that the lines are not extracted (no images...
Version 0.1.20, ocrd/core 2.33.0. I have a PAGE file which does not have any real content, like this: ``` ``` If I call `ocrd-segment-extract-lines`, I get an exception like...
> If I understand correctly, the idea behind these metrics is taken from the "rethinking semantic segmentation evaluation" paper, but could you explain to me how I could obtain AP, TPs, FPs, FNs for...
The multi-match overlap algorithm (necessary to calculate over- and undersegmentation) still has a glitch: it will create fake/redundant pairings if either side has a segmentation that already overlaps locally. For...
Even if we lack a free GUI, we should strive to generate [the layouteval schema](http://schema.primaresearch.org/PAGE/eval/layout/2019-07-15) from [PAGE](https://github.com/PRImA-Research-Lab/PAGE-XML/blob/master/layout-evaluation/schema/layouteval.xsd) as evaluation report. Currently, our output looks like this: example report ```JSON {...
ocrd-segment-repair has the optional operations "plausibilize" and "sanitize" – I have no idea what exactly these do :) I would prefer something like this:
* shrink-regions-to-hull-of-lines
* whatever-plausibilize-does

There seems...
I processed the data from the diverse dataset with ~1000 images from the Dropbox. The diff will be unreadable and GitHub does not like to show large notebooks, so I...

We should have heuristics to check for
- polygon containment (overlapping regions, word outside line etc.)
- artifacts from annotation like point or line-like regions
- lines with (way) too...
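
The first two checks in this list could be sketched roughly as follows. This is a minimal illustration in plain Python, not code from ocrd_segment: the function names, the `min_area` threshold and the vertex-only containment test are all assumptions made for the example. Polygons are taken to be lists of `(x, y)` tuples, as they would be after parsing the `points` attribute of a PAGE `Coords` element.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(points: List[Point]) -> float:
    """Absolute polygon area via the shoelace formula (0.0 for < 3 points)."""
    n = len(points)
    if n < 3:
        return 0.0
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def is_degenerate(points: List[Point], min_area: float = 1.0) -> bool:
    """Flag point-like or line-like regions.

    min_area is a hypothetical threshold in square pixels, not a value
    taken from the project.
    """
    return polygon_area(points) < min_area


def point_in_polygon(pt: Point, polygon: List[Point]) -> bool:
    """Ray-casting test; points exactly on the boundary may go either way."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal ray at height y can cross it.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def vertices_outside(child: List[Point], parent: List[Point]) -> List[Point]:
    """Return the child's vertices that are not inside the parent polygon.

    An empty result is only a necessary condition for containment: with a
    concave parent, a child edge can still leave the parent between vertices.
    """
    return [p for p in child if not point_in_polygon(p, parent)]


line = [(0, 0), (100, 0), (100, 20), (0, 20)]
word_ok = [(10, 5), (40, 5), (40, 15), (10, 15)]
word_bad = [(90, 5), (120, 5), (120, 15), (90, 15)]

print(vertices_outside(word_ok, line))
print(vertices_outside(word_bad, line))
print(is_degenerate([(0, 0), (50, 0), (100, 0)]))
```

A geometry library would handle boundary cases and full polygon-in-polygon containment more robustly; the vertex test here only shows the shape such a heuristic might take.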