SRL Annotator

Open bhargav opened this issue 8 years ago • 9 comments

Annotator for the Semantic Role Labeler

  • [x] ~Change the DataModel to be an object~
  • [x] Add Annotator class
  • [x] Remove redundant SRLConfigurator.java
  • [ ] ~Train models for Predicate Sense and populate Sense information in annotator~
  • [x] Unit Test for Annotator
  • [ ] ~Move constants used in Annotator to someplace better~ (Dropping this)
  • [x] Retrain and deploy all the models.

I'm dropping the requirement for training Predicate Sense for this PR. I can see if the illinois-verbsense package can be integrated as a separate PR.

bhargav avatar Apr 14 '17 06:04 bhargav

@bhargav this sounds great. However, do you think it is possible to populate all the data in a single datamodel object efficiently? The reason I defined a class for the SRL datamodel was to be able to have small graphs and then integrate them.

kordjamshidi avatar Apr 16 '17 23:04 kordjamshidi

@kordjamshidi Population works fine for small datasets (for example, the test set has ~2000 sentences), but it takes much longer to populate a large number of sentences.

I did some profiling, and most of the time during population of large graphs is spent in the textAnnotationToRelationMatch matching function. I'm trying to debug this behavior.

bhargav avatar Apr 17 '17 16:04 bhargav

Yes, right. I am sure the issue was related to establishing matching edges. I hope you can find out what the actual issue is.

kordjamshidi avatar Apr 17 '17 16:04 kordjamshidi

I have trained models with PARSE_STANFORD and deployed them.

bhargav avatar May 02 '17 06:05 bhargav

@bhargav thanks, I will review this, but I had hoped we could find a better solution for the population with the matching sensor. Maybe in another PR then.

kordjamshidi avatar May 02 '17 11:05 kordjamshidi

Semaphore failed with the following exception related to MapDB.

[error] Uncaught exception when running edu.illinois.cs.cogcomp.saulexamples.nlp.SemanticRoleLabeling.ModelsTest: org.mapdb.DBException$DataCorruption: Header checksum broken. Store was not closed correctly, or is corrupted
sbt.ForkMain$ForkError: Header checksum broken. Store was not closed correctly, or is corrupted

bhargav avatar May 02 '17 21:05 bhargav

Thanks! Looks good to me!

Merge call with @kordjamshidi

danyaljj avatar May 03 '17 05:05 danyaljj

Did any result change? @bhargav

kordjamshidi avatar May 03 '17 11:05 kordjamshidi

Reverting to the previous data models. Models trained with PARSE_STANFORD were a couple of points lower than PARSE_GOLD on per-argument evaluation. However, evaluating the pipeline model with the PredicateArgumentEvaluator gives similar results for both.

| Model | Precision | Recall | F1 |
|---|---|---|---|
| Verb SRL (PARSE_GOLD) | 65.56 | 64.08 | 64.81 |
| Verb SRL (PARSE_STANFORD) | 65.52 | 63.48 | 64.48 |
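
As a quick sanity check, the F1 values in the table above agree with the harmonic mean of the listed precision and recall. The snippet below is a minimal sketch that assumes the standard F1 definition; the thread does not say how PredicateArgumentEvaluator computes its scores, and the helper name `f1` is purely illustrative.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table, in percent.
reported = {
    "PARSE_GOLD": (65.56, 64.08, 64.81),
    "PARSE_STANFORD": (65.52, 63.48, 64.48),
}

for name, (p, r, reported_f1) in reported.items():
    computed = round(f1(p, r), 2)
    status = "matches reported" if computed == reported_f1 else "differs from reported"
    print(f"{name}: computed F1 = {computed} ({status})")
```

Under that assumption, the gap between the two settings is about 0.33 F1 points, which comes almost entirely from recall (64.08 vs. 63.48).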

bhargav avatar May 10 '17 18:05 bhargav