SRL Annotator
Annotator for the Semantic Role Labeler
- [x] ~Change the DataModel to be an object~
- [x] Add Annotator class (sketched below)
- [x] Remove redundant SRLConfigurator.java
- [ ] ~Train models for Predicate Sense and populate Sense information in annotator~
- [x] Unit Test for Annotator
- [ ] ~Move constants used in Annotator to someplace better~ (Dropping this)
- [x] Retrain and deploy all the models.
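
The Annotator class item above is the core of this PR. Here is a minimal sketch of how such a class plugs into the cogcomp `Annotator` contract (constructor taking the output view name and the required views, plus `initialize` and `addView`). The class name, output view, and required views shown here are placeholders for illustration, not the PR's actual code.

```scala
import edu.illinois.cs.cogcomp.annotation.Annotator
import edu.illinois.cs.cogcomp.core.datastructures.ViewNames
import edu.illinois.cs.cogcomp.core.datastructures.textannotation.TextAnnotation
import edu.illinois.cs.cogcomp.core.utilities.configuration.ResourceManager

// Hypothetical shape of the SRL annotator: declare the views it depends on,
// load the trained models in initialize(), and write the output view in addView().
class SemanticRoleLabelerAnnotator(outputView: String = ViewNames.SRL_VERB)
  extends Annotator(outputView, Array(ViewNames.POS, ViewNames.SHALLOW_PARSE, ViewNames.PARSE_STANFORD)) {

  override def initialize(rm: ResourceManager): Unit = {
    // Load the trained predicate and argument classifiers here.
  }

  override def addView(ta: TextAnnotation): Unit = {
    // Run the trained SRL models over `ta` and attach the resulting
    // predicate-argument view to the TextAnnotation under `outputView`.
  }
}
```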
I'm dropping the requirement for training Predicate Sense from this PR. I will look into integrating the illinois-verbsense package in a separate PR.
@bhargav this sounds great. However, do you think it is possible to populate all the data in a single datamodel object efficiently? The reason I defined a class for the SRL datamodel was to be able to have small graphs and then integrate them.
@kordjamshidi Population works fine for small datasets (for example, the test set has ~2000 sentences), but it takes much longer to populate a large number of sentences.
I did some profiling, and most of the population time for large graphs is spent in the textAnnotationToRelationMatch matching function. I'm still debugging this behavior.
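
For reference, a coarse timing wrapper along these lines is enough to show which population step dominates; the node name and the `populate` call in the usage comment are placeholders, not the data model's actual API.

```scala
// Small helper to time individual population steps and print the elapsed time.
object PopulationProfiler {
  def timed[T](label: String)(block: => T): T = {
    val start = System.nanoTime()
    val result = block
    val elapsedMs = (System.nanoTime() - start) / 1e6
    println(f"$label%-40s $elapsedMs%10.1f ms")
    result
  }
}

// Hypothetical usage, assuming a `sentences` node on the SRL data model:
// PopulationProfiler.timed("populate sentences") {
//   SRLDataModel.sentences.populate(trainingSentences)
// }
```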
Yes, right. I am sure the issue is related to establishing the matching edges. I hope you can find the actual cause.
I have trained the models with PARSE_STANFORD and deployed them.
@bhargav thanks, I will review this, but I had hoped we could find a better solution for population with the matching sensor. Maybe in another PR, then.
Semaphore failed with the following exception related to MapDB.
[error] Uncaught exception when running edu.illinois.cs.cogcomp.saulexamples.nlp.SemanticRoleLabeling.ModelsTest: org.mapdb.DBException$DataCorruption: Header checksum broken. Store was not closed correctly, or is corrupted sbt.ForkMain$ForkError: Header checksum broken. Store was not closed correctly, or is corrupted
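
This error usually means the MapDB store backing the model cache was not closed before the forked test JVM exited. A minimal sketch of opening such a cache so it always closes cleanly (the object and file names here are hypothetical, not the project's actual cache code):

```scala
import org.mapdb.{DB, DBMaker}

object ModelCache {
  // Open the cache with a write-ahead log and register a shutdown hook so the
  // store is closed even when the test JVM exits abruptly, avoiding the
  // "Header checksum broken" corruption error on the next run.
  def open(path: String = "models-cache.db"): DB =
    DBMaker
      .fileDB(path)
      .transactionEnable()
      .closeOnJvmShutdown()
      .make()

  // For a store that is already corrupted, MapDB's checksumHeaderBypass()
  // can force it open, but rebuilding the cache is usually the safer fix.
}
```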
Thanks! Looks good to me!
Merge call with @kordjamshidi
Did any result change? @bhargav
Reverting to the previous data models. Models trained with PARSE_STANFORD were a couple of points lower than PARSE_GOLD on the per-argument evaluation, but the pipeline model evaluated with the PredicateArgumentEvaluator gives similar results.
| Model | Precision | Recall | F1 |
|---|---|---|---|
| Verb SRL (PARSE_GOLD) | 65.56 | 64.08 | 64.81 |
| Verb SRL (PARSE_STANFORD) | 65.52 | 63.48 | 64.48 |