Andreas van Cranenburgh
Hi Allen, I refer to this material in my courses, but unfortunately the links are broken, e.g.: https://de.dariah.eu/tatom/visualizing_trends.html I mailed the DARIAH team about this several times, and their response...
- [ ] hierarchical subcorpus selection; handle corpora with a large number of sections
- [ ] query cancellation: pressing stop in the browser should cancel the query.
- [ ] pagination:...
https://github.com/explosion/wheelwright
multiprocessing pools work fine unless any kind of error condition arises...

- [ ] properly detect segmentation faults, out of memory, &c. `concurrent.futures` does this, but doesn't take an `initializer`...
Tessil/ordered-map might be a better trade-off than spp::sparse_hash.
When these are installed, the installed script is wrong:

```
$ cat `which discodop`
#!/usr/bin/python3
# EASY-INSTALL-SCRIPT: 'disco-dop==0.5rc1','discodop'
__requires__ = 'disco-dop==0.5rc1'
__import__('pkg_resources').run_script('disco-dop==0.5rc1', 'discodop')
```

The workaround is to remove these...
e.g., a pathological sentence with >1000 words will be too deep to recurse on when binarized.

- Any function that directly recurses on the children of a tree is affected, as...
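
One common way around the recursion limit is to replace direct recursion on children with an explicit stack. The following is only a minimal sketch of that idea: the `Node` class and the `right_branching` and `leaves` helpers are hypothetical stand-ins for illustration, not the project's actual tree type or API.

```python
class Node:
    """Hypothetical minimal tree node (an assumption, not the real tree class)."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)


def right_branching(words):
    """Build a right-branching binary tree bottom-up, without recursion."""
    tree = Node(words[-1])
    for word in reversed(words[:-1]):
        tree = Node('X', [Node(word), tree])
    return tree


def leaves(tree):
    """Collect leaf labels from left to right using an explicit stack."""
    result = []
    stack = [tree]
    while stack:
        node = stack.pop()
        if node.children:
            # Push children in reverse so the leftmost child is visited first.
            stack.extend(reversed(node.children))
        else:
            result.append(node.label)
    return result


# A 5000-word "sentence" gives a tree roughly 5000 levels deep, well beyond
# CPython's default recursion limit of 1000.
words = ['w%d' % n for n in range(5000)]
tree = right_branching(words)
print(len(leaves(tree)))
```

Because the pending nodes live in a heap-allocated list rather than on the call stack, the traversal depth is bounded by available memory instead of the interpreter's recursion limit.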
- tgrep2: generally fast, but loads corpus at every invocation, and always returns an exhaustive list of all matches; no support for discontinuous constituents.
- xpath / alpinocorpus: memory hungry,...
Would allow a potentially significant speedup for treebank transformations and grammar extraction.

Wishlist:

- represent all treebank information: functions, morphology, lemmas, &c.
- combine indices and words in one datastructure...
I'm trying to run the dcoref system on a plain text file and want to get the output in CoNLL 2012 format. I've tried several things: ``` $ ./corenlp.sh -annotators...