Jesse de Does comments

Results: 11 comments of Jesse de Does

Text output badly formatted

Dear John, You are right, the plain text output from ALTO is execrable. The reason is that conversion takes place indirectly: ALTO --> tokenized TEI with zoning --> plain text....

Would you please provide me with the contents of the lib directory?

I have added content in the lib directory. Please let me know if you have any problems!

Can OpenConvert convert plain text file to TEI?

Hello all, sorry to catch up only today - The right command line for conversion from txt to TEI is (txt not text) java -jar OpenConvert.jar -from txt -to TEI...

dependency problems

Thanks both!! I can install @PonteIneptique's version. I run into CUDA issues later on, but that is most likely a problem of my local machine.

dependency problems

Thanks again! (My machine does have CUDA, but it magically gets mixed up on system updates from time to time)

BE feedback

First the easy ones: - We fixed the validation issue found by Tomaz in one of the files - We removed the resp statement for linguistic annotation from the annotated...

BE feedback

- missing text: this has to do with text paragraphs which could not automatically be classified in the first step of the conversion from HTML to TEI. In the first...

BE feedback

* Using common taxonomies. We have tried to do this as much as possible now. When categories we need are missing from the common taxonomy, we add a -BE file...

BE feedback

Multiple speaker types indeed break the validation: ``` Error: Type error on line 332 column 49 of parlamint-lib.xsl: XTTE0780 A sequence of more than one item is not allowed as...

Summarizing: - Gap becomes note for the unclassified paragraphs. Some way to characterize this content would be welcome. Maybe allow `subtype="problematic_content"` or something along those lines? - We removed some...