foliapy
An extensive Python library for dealing with FoLiA (Format for Linguistic Annotation) documents, a rich XML-based format for linguistic annotation finding application in Natural Language Processing (N...
When a text content has an offset without an explicit reference, the offset is by definition relative to the text content of the nearest structure parent. In general this is...
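A minimal sketch of what such an offset check amounts to, using plain strings rather than the library's own classes. The function name `offset_is_consistent` and the sample strings are hypothetical and are not part of foliapy; the only assumption is that the offset counts characters from the start of the parent's text.

```python
def offset_is_consistent(parent_text: str, child_text: str, offset: int) -> bool:
    """Return True if child_text occurs in parent_text exactly at `offset`.

    Hypothetical helper for illustration only: the parent is assumed to be
    the nearest structural ancestor that carries text content.
    """
    if offset < 0:
        # Negative offsets would be interpreted relative to the end of the
        # string by Python, so reject them explicitly.
        return False
    return parent_text.startswith(child_text, offset)


# Illustrative data: a sentence and one of its words.
sentence_text = "Hello world"
word_text = "world"

print(offset_is_consistent(sentence_text, word_text, 6))
print(offset_is_consistent(sentence_text, word_text, 5))
```

A validator following this idea would report an error whenever the check fails for the declared offset.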
Hello and thanks for the repo, I think I have an issue with text validation. I have created a folia Doc with sentence elements and their text. I am machine...
Given this example: ```xml Appelperentaart ``` folia2txt produces: ``Appeltaart``, which is maybe consistent with [the docs](https://folia.readthedocs.io/en/latest), although it is not explicitly forbidden (``should`` isn't ``may not``). FoLiA-2text gives: ``Appelperentaart``,...
when running ``foliavalidator examples/gaps.2.0.0.folia.xml -o`` the CDATA block in the `` is discarded. IN: ```xml This is the cover of the book ``` OUT ```xml This is the cover of...
I added a new variant for Arabic to the examples: `arabic.2.2.1.folia.xml` with offset information everywhere. Both folialint and foliavalidator accept this file. NICE! BUT: when I replace the offset in...
When running foliavalidator on `examples/full-legacy.1.5.folia.xml` something strange happens: ``foliavalidator ../FoLiApy/folia-repo/examples/full-legacy.1.5.folia.xml --keepversion -o`` The `` block is output normally, but then: ```xml proycon Stemma ``` So, no formatting anymore?
https://github.com/proycon/folia/blob/master/test/example.xml#L1298 used to have wrong offset and was not detected
Error showed in FLAT, reported by @roelsmeets: 
Right now, the XML tree is discarded unless `mode=XPATH` was a keyword argument. If the `xpath` function is called on a document not in this mode, it fails with `AttributeError:...
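The pattern described here can be sketched with the standard library alone. This is not foliapy's code: the `Mode` enum, the `Document` class, and the error message below are hypothetical stand-ins, and `xml.etree.ElementTree` supports only a limited XPath subset. The sketch shows a document that keeps its parsed tree only in XPath mode and fails with an explicit message instead of an `AttributeError` otherwise.

```python
import xml.etree.ElementTree as ET
from enum import Enum


class Mode(Enum):
    """Hypothetical loading modes, for illustration only."""
    MEMORY = 0
    XPATH = 1


class Document:
    """Toy document that retains the XML tree only in XPATH mode."""

    def __init__(self, xml_string: str, mode: Mode = Mode.MEMORY):
        tree = ET.fromstring(xml_string)
        self.mode = mode
        # Pretend the in-memory model is just the list of word texts.
        self.words = [w.text for w in tree.iter("w")]
        # The tree is discarded unless XPath queries were requested.
        self.tree = tree if mode is Mode.XPATH else None

    def xpath(self, query: str):
        if self.tree is None:
            # Fail with a clear message rather than an AttributeError.
            raise RuntimeError(
                "xpath() requires the document to be loaded with mode=Mode.XPATH"
            )
        return self.tree.findall(query)


doc = Document("<s><w>a</w><w>b</w></s>", mode=Mode.XPATH)
print([e.text for e in doc.xpath(".//w")])
```

An alternative design would be to re-parse the source lazily on the first `xpath` call, at the cost of extra memory and parse time.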