SRL Annotator
Annotator for the Semantic Role Labeler
- [x] ~Change the DataModel to be an object~
- [x] Add Annotator class (sketched below)
- [x] Remove redundant SRLConfigurator.java
- [ ] ~Train models for Predicate Sense and populate Sense information in annotator~
- [x] Unit Test for Annotator
- [ ] ~Move constants used in Annotator to someplace better~ (Dropping this)
- [x] Retrain and deploy all the models.
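
The Annotator class item above is the core of this PR. Here is a minimal sketch of how such a class plugs into the cogcomp `Annotator` contract (constructor taking the output view name and the required views, plus `initialize` and `addView`). The class name, output view, and required views shown here are placeholders for illustration, not the PR's actual code.

```scala
import edu.illinois.cs.cogcomp.annotation.Annotator
import edu.illinois.cs.cogcomp.core.datastructures.ViewNames
import edu.illinois.cs.cogcomp.core.datastructures.textannotation.TextAnnotation
import edu.illinois.cs.cogcomp.core.utilities.configuration.ResourceManager

// Hypothetical shape of the SRL annotator: declare the views it depends on,
// load the trained models in initialize(), and write the output view in addView().
class SemanticRoleLabelerAnnotator(outputView: String = ViewNames.SRL_VERB)
  extends Annotator(outputView, Array(ViewNames.POS, ViewNames.SHALLOW_PARSE, ViewNames.PARSE_STANFORD)) {

  override def initialize(rm: ResourceManager): Unit = {
    // Load the trained predicate and argument classifiers here.
  }

  override def addView(ta: TextAnnotation): Unit = {
    // Run the trained SRL models over `ta` and attach the resulting
    // predicate-argument view to the TextAnnotation under `outputView`.
  }
}
```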
I'm dropping the requirement for training Predicate Sense from this PR. I will look into integrating the illinois-verbsense package in a separate PR.
@bhargav this sounds great. However, do you think it is possible to populate all the data in a single datamodel object efficiently? The reason I defined a class for the SRL datamodel was to be able to have small graphs and then integrate them.
@kordjamshidi Population works fine for small datasets (for example, the test set has ~2000 sentences), but it takes much longer to populate a large number of sentences.
I did some profiling, and most of the population time for large graphs is spent in the textAnnotationToRelationMatch matching function. I'm still debugging this behavior.
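
For reference, a coarse timing wrapper along these lines is enough to show which population step dominates; the node name and the `populate` call in the usage comment are placeholders, not the data model's actual API.

```scala
// Small helper to time individual population steps and print the elapsed time.
object PopulationProfiler {
  def timed[T](label: String)(block: => T): T = {
    val start = System.nanoTime()
    val result = block
    val elapsedMs = (System.nanoTime() - start) / 1e6
    println(f"$label%-40s $elapsedMs%10.1f ms")
    result
  }
}

// Hypothetical usage, assuming a `sentences` node on the SRL data model:
// PopulationProfiler.timed("populate sentences") {
//   SRLDataModel.sentences.populate(trainingSentences)
// }
```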
Yes, right. I am sure the issue is related to establishing the matching edges. I hope you can find the actual cause.
I have trained the models with PARSE_STANFORD and deployed them.
@bhargav thanks, I will review this, but I had hoped we could find a better solution for population with the matching sensor. Maybe in another PR, then.
Semaphore failed with the following exception related to MapDB.
[error] Uncaught exception when running edu.illinois.cs.cogcomp.saulexamples.nlp.SemanticRoleLabeling.ModelsTest: org.mapdb.DBException$DataCorruption: Header checksum broken. Store was not closed correctly, or is corrupted sbt.ForkMain$ForkError: Header checksum broken. Store was not closed correctly, or is corrupted
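
This error usually means the MapDB store backing the model cache was not closed before the forked test JVM exited. A minimal sketch of opening such a cache so it always closes cleanly (the object and file names here are hypothetical, not the project's actual cache code):

```scala
import org.mapdb.{DB, DBMaker}

object ModelCache {
  // Open the cache with a write-ahead log and register a shutdown hook so the
  // store is closed even when the test JVM exits abruptly, avoiding the
  // "Header checksum broken" corruption error on the next run.
  def open(path: String = "models-cache.db"): DB =
    DBMaker
      .fileDB(path)
      .transactionEnable()
      .closeOnJvmShutdown()
      .make()

  // For a store that is already corrupted, MapDB's checksumHeaderBypass()
  // can force it open, but rebuilding the cache is usually the safer fix.
}
```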
Thanks! Looks good to me!
Merge call with @kordjamshidi
Did any result change? @bhargav
Reverting to the previous data models. Models trained with PARSE_STANFORD were a couple of points lower than PARSE_GOLD on the per-argument evaluation, but the pipeline model evaluated with the PredicateArgumentEvaluator gives similar results.
| Model | Precision | Recall | F1 |
|---|---|---|---|
| Verb SRL (PARSE_GOLD) | 65.56 | 64.08 | 64.81 |
| Verb SRL (PARSE_STANFORD) | 65.52 | 63.48 | 64.48 |