preprocess
Sentence splitter uses unbounded memory in -k mode
In -k mode, one would expect the sentence splitter to need only a small amount of RAM, just enough to hold a single line. Instead, it stores the entire split file in memory before printing anything.
https://github.com/kpu/preprocess/blob/344208c6c30ce0ece41ead02467c30ed10b0413c/moses/ems/support/split-sentences.perl#L119
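To make the difference concrete, here is a rough sketch in Python rather than Perl. It is not the code from split-sentences.perl: `split_line` is a hypothetical toy rule standing in for the real splitting logic, and it assumes each input line can be split on its own, as the expectation above implies. `split_buffered` mirrors the behaviour reported here (memory grows with the input), while `split_streaming` holds only the current line.

```python
import io
import re
from typing import Iterable, Iterator, List, TextIO

# Hypothetical toy rule: break after ., ! or ? followed by whitespace.
# The real script's rules (prefix lists, language handling) are far richer.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def split_line(line: str) -> List[str]:
    """Split one input line into sentences; blank lines yield nothing."""
    text = line.strip()
    return _BOUNDARY.split(text) if text else []


def split_buffered(lines: Iterable[str]) -> List[str]:
    """Behaviour described in this issue: collect all output, return it at the end."""
    out: List[str] = []
    for line in lines:
        out.extend(split_line(line))  # grows with the size of the input
    return out


def split_streaming(lines: Iterable[str]) -> Iterator[str]:
    """Expected behaviour: emit each line's sentences before reading the next line."""
    for line in lines:
        yield from split_line(line)  # only the current line is held in memory


def run(src: TextIO, dst: TextIO) -> None:
    """Write one sentence per output line, as soon as it is available."""
    for sentence in split_streaming(src):
        dst.write(sentence + "\n")


# Small in-memory demonstration; a real run would pass sys.stdin and sys.stdout.
demo_out = io.StringIO()
run(io.StringIO("A b. C d?\n\nE f!\n"), demo_out)
```

Both functions produce the same sentences in the same order. The only difference is when output becomes available and how much is held in memory in the meantime.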
@jelmervdl