bist-parser
Graph-based and Transition-based dependency parsers based on BiLSTMs
A PyTorch implementation of the BIST parsers (graph-based parser only)
More precisely, this implementation is a line-by-line translation of the DyNet implementation, which can be found here. The techniques behind the parser are described in the paper Simple and Accurate Dependency Parsing Using Bidirectional LSTM Feature Representations.
Required software
- Python 2.7 interpreter (for a Python 3 implementation, check out the pytorch_python3 branch; thanks to Zhiqiang Xie)
- PyTorch library
Data format
The software requires training.conll and development.conll files formatted according to the CoNLL data format, or training.conllu and development.conllu files formatted according to the CoNLL-U data format.
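To make the expected input concrete, here is a minimal sketch of reading such a file. It is not the parser's own loader. It assumes the usual layout of ten tab-separated columns per token, sentences separated by blank lines, and (for CoNLL-U) comment lines starting with #. The names Token and read_conll are hypothetical and used only for illustration.

```python
from collections import namedtuple

# Field names follow the CoNLL-U column order. CoNLL files also have ten
# tab-separated columns, with HEAD and DEPREL in the same positions.
# "Token" and "read_conll" are illustrative names, not part of this repository.
Token = namedtuple(
    "Token", "id form lemma upos xpos feats head deprel deps misc"
)


def read_conll(lines):
    """Yield one list of Token tuples per sentence."""
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line ends the current sentence.
            if sentence:
                yield sentence
                sentence = []
        elif line.startswith("#"):
            # CoNLL-U comment line (for example, sentence metadata).
            continue
        else:
            fields = line.split("\t")
            if len(fields) != 10:
                raise ValueError("expected 10 columns, got %d" % len(fields))
            sentence.append(Token(*fields))
    if sentence:
        # The last sentence may not be followed by a blank line.
        yield sentence


# A made-up three-token sentence in the ten-column layout.
sample = [
    "1\tThe\tthe\tDET\tDT\t_\t2\tdet\t_\t_\n",
    "2\tdog\tdog\tNOUN\tNN\t_\t3\tnsubj\t_\t_\n",
    "3\tbarks\tbark\tVERB\tVBZ\t_\t0\troot\t_\t_\n",
    "\n",
]
sentences = list(read_conll(sample))
for tok in sentences[0]:
    print("{}\t{}\t{}\t{}".format(tok.id, tok.form, tok.head, tok.deprel))
```

The HEAD column holds the index of each token's parent (0 for the root) and DEPREL holds the relation label; these are the two columns a dependency parser predicts. Passing an open file object instead of the sample list works the same way, since the function only iterates over lines.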
Train a parsing model
python src/parser.py --outdir [results directory] --train data/en-universal-train.conll --dev data/en-universal-dev.conll --epochs 30 --lstmdims 125 --bibi-lstm
Parse data with your parsing model
The command for parsing a test.conll file formatted according to the CoNLL data format with a previously trained model is:
python src/parser.py --predict --outdir [results directory] --test data/en-universal-test.conll --model [trained model file] --params [param file generated during training]
The parser stores the resulting CoNLL file in the output directory (--outdir).