Martin Hammarstedt
**Snakemake version** 8.3.1 **Describe the bug** When using `script:` with a Python script, execution fails if the path to the Python executable contains whitespace. I think the fix should be...
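A minimal sketch of the general failure mode, not of Snakemake's actual code: a command line built by plain string concatenation is split at every space, so an interpreter path containing whitespace is broken apart. The function name `build_command` and the example paths are hypothetical, and `shlex.quote` targets POSIX shells only.

```python
import shlex


def build_command(python_executable: str, script_path: str) -> str:
    """Build a shell command line with both parts quoted (hypothetical helper)."""
    return f"{shlex.quote(python_executable)} {shlex.quote(script_path)}"


# Hypothetical interpreter path containing a space.
executable = "/opt/my tools/bin/python"

# Naive concatenation: the shell would see three words instead of two.
naive = f"{executable} run.py"
naive_words = shlex.split(naive)

# Quoted version: the path survives as a single word.
quoted_words = shlex.split(build_command(executable, "run.py"))
```

Passing the command as an argument list to `subprocess.run` would avoid shell quoting altogether, which may be the simpler route where no shell features are needed.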
Make it possible to get the number of a certain structural element that contains hits, e.g. number of sentences or texts.
When annotating a corpus, save metadata in some way (e.g. as a header in the XML export or in a sidecar file) about the version of Sparv used, version or...
If `export.word` is set to an annotation with different string lengths than the original source text, the pretty XML export will produce incorrect text output. This is probably because of...
Add a simple way for the user to add attributes with constant values to other annotations. For example, adding an author or title to the `text` annotation without having to...
We need to be able to differentiate between empty annotation values (e.g. the token has been annotated, but with no value as a result) and when no annotation has been...
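One possible way to make that distinction, sketched here with a sentinel object. The names `MISSING` and `describe` are hypothetical and not part of Sparv; the point is only that an empty string and "never annotated" get different representations.

```python
class _Missing:
    """Sentinel type marking a token that has not been annotated at all."""

    def __repr__(self) -> str:
        return "MISSING"


# Single shared instance, compared by identity.
MISSING = _Missing()


def describe(value) -> str:
    """Tell apart 'no annotation made' from 'annotated with an empty value'."""
    if value is MISSING:
        return "not annotated"
    if value == "":
        return "annotated, empty value"
    return f"annotated: {value}"
```

Using a dedicated object rather than `None` leaves `None` free for other uses, though either would work in this sketch.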
Make it possible to extract metadata from source filenames in the `text_import` module. Maybe using regular expressions or some other type of pattern. We really shouldn't need this for `xml_import`....
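As an illustration of the regular expression approach, here is a minimal sketch using named groups. The pattern, the filename layout `author_year_title`, and the function name are all assumptions made for the example, not an existing `text_import` setting.

```python
import re
from pathlib import Path

# Hypothetical filename layout: <author>_<year>_<title>.<ext>
FILENAME_PATTERN = re.compile(r"(?P<author>[^_]+)_(?P<year>\d{4})_(?P<title>.+)")


def metadata_from_filename(path: str) -> dict:
    """Return the named groups matched in the file stem, or {} if no match."""
    match = FILENAME_PATTERN.fullmatch(Path(path).stem)
    return match.groupdict() if match else {}
```

Each named group could then become an attribute on the `text` annotation, with the pattern supplied by the user in the corpus config.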
Currently each importer is limited to supporting source files with one specific file extension. Make it possible to support more than one.
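A rough sketch of what matching against several extensions could look like. The function name and the idea of passing a collection of extensions are assumptions for illustration; how importers actually declare their extension is not shown here.

```python
from pathlib import Path


def has_supported_extension(filename: str, extensions) -> bool:
    """Check a file's extension against a set of allowed ones (hypothetical helper)."""
    # Normalize so that "XML", ".xml" and "xml" are treated alike.
    wanted = {ext.lower().lstrip(".") for ext in extensions}
    return Path(filename).suffix.lower().lstrip(".") in wanted
```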
Currently much of the tokenization configuration is done via a file in the Sparv data dir. Investigate if we can move most or all of this to the corpus config.
Snakemake has a flag that makes it print the reason why a rule is run (e.g. saldo is rerun because the pos annotation is newer). Add support for this...