Martin Hammarstedt

Results: 35 issues by Martin Hammarstedt

**Snakemake version** 8.3.1 **Describe the bug** When using `script:` with a Python script, execution fails if the path to the Python executable contains whitespace. I think the fix should be...

bug
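
The kind of failure described in this report can be illustrated with a small sketch. It is only an illustration of how an unquoted interpreter path containing whitespace gets split into several arguments; it is not Snakemake's actual code and not necessarily the fix proposed in the issue, which is cut off above. The function `build_command` and the example paths are hypothetical.

```python
import shlex


def build_command(python_path: str, script_path: str, quote: bool) -> str:
    """Build a shell command string (hypothetical helper, for illustration only)."""
    if quote:
        # shlex.quote wraps arguments that contain whitespace in single quotes
        return f"{shlex.quote(python_path)} {shlex.quote(script_path)}"
    # Naive interpolation: whitespace in the path becomes an argument separator
    return f"{python_path} {script_path}"


# Hypothetical interpreter path containing a space
python_path = "/opt/my tools/python"
script_path = "script.py"

# shlex.split mimics how a POSIX shell would tokenize the command string
unquoted_args = shlex.split(build_command(python_path, script_path, quote=False))
quoted_args = shlex.split(build_command(python_path, script_path, quote=True))

print(unquoted_args)  # the interpreter path is split into two tokens
print(quoted_args)    # the interpreter path survives as one token
```

Passing the command as an argument list to `subprocess.run` instead of as a single shell string would avoid the quoting question altogether; whether that is applicable here depends on how the command is actually executed.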

Make it possible to get the number of a certain structural element that contains hits, e.g. number of sentences or texts.

enhancement

When annotating a corpus, save metadata in some way (e.g. as a header in the XML export or in a sidecar file) about the version of Sparv used, version or...

new functionality

If `export.word` is set to an annotation with different string lengths than the original source text, the pretty XML export will produce incorrect text output. This is probably because of...

bug

Add a simple way for the user to add attributes with constant values to other annotations. For example, adding an author or title to the `text` annotation without having to...

new functionality

We need to be able to differentiate between empty annotation values (e.g. the token has been annotated, but with no value as a result) and when no annotation has been...

new functionality

Make it possible to extract metadata from source filenames in the `text_import` module. Maybe using regular expressions or some other type of pattern. We really shouldn't need this for `xml_import`....

new functionality
fixed-unreleased
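
A minimal sketch of what regex-based metadata extraction from a filename could look like. The pattern, the field names (`author`, `year`, `title`), and the function `metadata_from_filename` are assumptions made for illustration; they are not taken from Sparv or from the `text_import` module.

```python
import re
from pathlib import Path

# Hypothetical naming scheme: <author>_<year>_<title>.<ext>
FILENAME_PATTERN = re.compile(r"(?P<author>[^_]+)_(?P<year>\d{4})_(?P<title>.+)")


def metadata_from_filename(path: str, pattern: re.Pattern = FILENAME_PATTERN) -> dict:
    """Return the named groups matched in the file stem, or an empty dict."""
    stem = Path(path).stem  # filename without directory and extension
    match = pattern.fullmatch(stem)
    return match.groupdict() if match else {}


print(metadata_from_filename("corpus/lagerlof_1891_gosta-berlings-saga.txt"))
print(metadata_from_filename("notes.txt"))  # does not follow the scheme
```

Named groups keep the mapping from pattern to attribute explicit, so a user-supplied pattern could decide which attributes end up on the `text` annotation.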

Currently each importer is limited to supporting source files with one specific file extension. Make it possible to support more than one.

enhancement

Currently much of the tokenization configuration is done via a file in the Sparv data dir. Investigate if we can move most or all of this to the corpus config.

enhancement

Snakemake has a flag that causes the reason a rule is run to be printed (e.g. saldo is rerun because the pos annotation is newer). Add support for this...

enhancement