ParsCit
Thesis Work
ParsCit-Trainingssystem-Extension
Additional tools for training and testing CRF++ models for ParsCit. They use BibTeX data and LaTeX (with customized .bst files) to build and test models automatically.
Usage (in /bin folder):
- train.pl (takes a BibTeX file and, optionally, additional CRF++-formatted data as input and builds a CRF++ model)
- test.pl (takes test data as a BibTeX file or a tagged references file, plus the name of the model file, and outputs data about the accuracy of the model)
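For orientation, CRF++ training data is plain text with one token per line, whitespace-separated columns, the label in the last column, and an empty line between sequences (here, between references). The fragment below only illustrates that general layout: the single feature column and the label names are assumptions made for this example, and the feature set ParsCit actually generates has many more columns.

```
Smith,      hasComma   author
J.          hasPeriod  author
2010.       hasDigit   date
Parsing     noPunct    title
citations.  hasPeriod  title

Doe,        hasComma   author
A.          hasPeriod  author
2012.       hasDigit   date
```

Every line in a file must have the same number of columns, so any additional data passed to train.pl presumably has to match the columns produced from the BibTeX input.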
Additional Requirements:
- LaTeX installation (to run LaTeX programmatically, edit /usr/share/texlive/texmf/web2c/texmf.cnf and change openout_any = p to openout_any = r)
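After the change, the relevant part of texmf.cnf could look like the fragment below. This is a sketch: the location of texmf.cnf varies between TeX distributions, and `kpsewhich texmf.cnf` (part of the kpathsea tools shipped with TeX Live) reports the file in use on a given system. The comment lines are explanatory additions, not text copied from the stock file.

```
% openout_any controls which files TeX programs may open for writing:
%   a = any, r = restricted, p = paranoid (the stock TeX Live setting)
openout_any = r
```

Once the file is saved, `kpsewhich -var-value=openout_any` should report `r` if the edited file is the one being read.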
Extension structure:
- /lib/Trainer: Modules for converting BibTeX data to tagged references and for converting tagged references to plain references + configuration module
- /resources/tex: original and modified BibTeX style files for converting BibTeX data to tagged and untagged references
- /resources/traindata: example train data
- /resources/traindata: example test data