PolyglotDB

Encoding baseline duration exceeds memory limit

Open · james-tanner opened this issue 6 years ago · 1 comment

I've been trying to use encode_baseline to encode baseline word duration inside a SPADE script. The current code is:

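from polyglotdb import CorpusContext

# config is defined earlier in the script
with CorpusContext(config) as c:
    # Only encode the baseline if it is not already in the hierarchy
    if not c.hierarchy.has_token_property('word', 'baseline'):
        print('getting baseline word duration')
        c.encode_baseline('word', 'duration')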

This works fine on smaller corpora (like ICE-Can or Modern RP), but exceeds the memory limit (even on Roquefort) for corpora of SOTC-size and larger.
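For reference, here is a minimal sketch of the kind of computation involved, assuming the baseline is a per-word-type mean duration. This is not PolyglotDB's implementation; `streaming_baseline` and the `(label, begin, end)` token tuples are hypothetical names used only for illustration. It shows that such a mean can be accumulated in one pass with memory proportional to the number of word types rather than the number of tokens.

```python
from collections import defaultdict


def streaming_baseline(tokens):
    """Return {label: mean duration} from an iterable of (label, begin, end).

    Hypothetical helper, not part of PolyglotDB. Only running totals and
    counts per label are kept, so the token stream is never held in memory.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for label, begin, end in tokens:
        totals[label] += end - begin
        counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}


# Toy data standing in for word tokens streamed from a corpus
tokens = [("cat", 0.0, 0.5), ("dog", 0.5, 1.0), ("cat", 1.0, 2.0)]
baselines = streaming_baseline(iter(tokens))
print(baselines)
```

Whether encode_baseline can be restructured along these lines depends on how it queries the database, which I have not checked.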

james-tanner · Jun 24 '19 14:06

@mmcauliffe, any thoughts on this? I know you probably won't have time to fix it before leaving, but any guidance would be appreciated. For example, do you suspect the issue will have been resolved by your recent memory optimizations, or does it seem like an actual bug?

msonderegger · Aug 06 '19 17:08