
Thesis Work

Open maboberlin opened this issue 9 years ago • 0 comments

ParsCit-Trainingssystem-Extension

Additional tools for training and testing CRF++ models for ParsCit. They use BibTeX data and LaTeX (with customized .bst files) to build and test models automatically.

Usage (in /bin folder):

  • train.pl (takes a BibTeX file and, optionally, additional CRF++-formatted data as input and builds a CRF++ model)
  • test.pl (takes test data in the form of a BibTeX file or a tagged references file, plus the name of the model file, and reports the accuracy of the model)
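
To illustrate the kind of accuracy figures a test run can report, here is a minimal Python sketch. It is not the code in test.pl. It assumes the default output of CRF++'s crf_test (no -v option), where each non-empty line repeats the input columns and appends the predicted label, so the last two columns are the gold label and the prediction. The function name score_crf_output and the sample labels are hypothetical.

```python
from collections import Counter

def score_crf_output(lines):
    """Return (overall_accuracy, per_label_accuracy) for crf_test-style lines."""
    total = Counter()    # tokens seen per gold label
    correct = Counter()  # correctly predicted tokens per gold label
    for line in lines:
        cols = line.split()
        if len(cols) < 3:
            # Blank lines separate sequences; a scored line needs at least
            # a token, a gold label and a predicted label.
            continue
        gold, predicted = cols[-2], cols[-1]
        total[gold] += 1
        if gold == predicted:
            correct[gold] += 1
    n = sum(total.values())
    overall = sum(correct.values()) / n if n else 0.0
    per_label = {label: correct[label] / total[label] for label in total}
    return overall, per_label

# Hypothetical sample: token, gold label, predicted label.
sample = [
    "Smith author author",
    "J. author author",
    "Parsing title title",
    "citations title journal",
    "",
    "2017 date date",
]
overall, per_label = score_crf_output(sample)
print(f"overall: {overall:.2f}")
for label, acc in sorted(per_label.items()):
    print(f"{label}: {acc:.2f}")
```

Real ParsCit feature files carry many more columns between the token and the labels; the sketch only relies on the last two.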

Additional Requirements:

  • LaTeX installation (to run LaTeX programmatically, edit /usr/share/texlive/texmf/web2c/texmf.cnf and change openout_any = p to openout_any = r)
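
The texmf.cnf change can also be scripted. The following Python sketch only transforms the configuration text; reading and writing the actual file (which usually requires root permissions) is left out. The helper name set_openout_any is hypothetical, and the sketch assumes the setting appears on an uncommented line of the form openout_any = p.

```python
import re

# Matches an active (uncommented) openout_any line and captures everything
# up to the value, so the original spacing is preserved.
OPENOUT_LINE = re.compile(r"^([ \t]*openout_any[ \t]*=[ \t]*)\S+", re.MULTILINE)

def set_openout_any(cnf_text, value="r"):
    """Return cnf_text with the active openout_any setting changed to value."""
    if value not in {"a", "r", "p"}:
        raise ValueError("openout_any must be one of a, r, p")
    new_text, count = OPENOUT_LINE.subn(lambda m: m.group(1) + value, cnf_text)
    if count == 0:
        raise ValueError("no active openout_any setting found")
    return new_text

# Hypothetical excerpt of a texmf.cnf file; lines starting with % are comments.
sample_cnf = "% Allow TeX to write files?\nopenin_any = a\nopenout_any = p\n"
print(set_openout_any(sample_cnf))
```

Note that r (restricted) is less strict than p (paranoid), so this change loosens where TeX may write output files.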

Extension structure:

  • /lib/Trainer: modules for converting BibTeX data to tagged references and for converting tagged references to plain references, plus a configuration module
  • /resources/tex: original and modified BibTeX style files for converting BibTeX data to tagged and untagged references
  • /resources/traindata: example training data
  • /resources/traindata: example test data
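
As a rough illustration of the two conversions handled by the Trainer modules, the Python sketch below turns a tagged reference into a plain reference and into simplified CRF++-style lines. It is not the Perl code in /lib/Trainer. It assumes the ParsCit-style tagging convention of <label> ... </label> fields, and the function names and the sample reference are hypothetical. The CRF++ lines contain only the token and its label, whereas real feature files add many feature columns in between.

```python
import re

# One tagged field: <label> text </label>, with matching open and close tags.
TAGGED_FIELD = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)

def parse_tagged(reference):
    """Split a tagged reference into (token, label) pairs.

    Text outside of any tagged field is ignored.
    """
    pairs = []
    for label, text in TAGGED_FIELD.findall(reference):
        pairs.extend((token, label) for token in text.split())
    return pairs

def to_plain(reference):
    """Drop the tags and return the reference as plain text."""
    return " ".join(token for token, _ in parse_tagged(reference))

def to_crfpp_lines(reference):
    """Return one 'token label' line per token (label in the last column)."""
    return [f"{token} {label}" for token, label in parse_tagged(reference)]

# Hypothetical tagged reference.
sample_ref = (
    "<author> M. Example </author> "
    "<title> A sample title </title> "
    "<date> 2017 </date>"
)
print(to_plain(sample_ref))
print("\n".join(to_crfpp_lines(sample_ref)))
```

In a CRF++ training file, the lines of one reference form one sequence, and an empty line separates it from the next.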

maboberlin, Feb 10 '17 19:02