Fan Jiang comments




Fan Jiang

Model structure redundancy

Hi @grig-guz! I have also implemented this model in PyTorch, but I consistently see a gap of around 1.2 F1 points relative to the official results reported in the paper. How does...

Possible for Training on multiple GPUs?

Thanks for your reply. I agree that randomly truncating the whole document into several consecutive sentences is essential for training, just as Joshi did for his BERT baseline...

Possible for Training on multiple GPUs?

Yes, the backward coreference score consumes far more memory than the forward one. I also noticed that you use only the sentence in which mention x resides as the context...