Ioana Baldini
Thanks for pointing this out, @pltrdy. Removing that line allows the code to run on the GPU, and it's going 20x faster. So you pretty much made my life go...
I'm new to everything related to this work, but learning. From what I read in the docs and looking at the code, this is my understanding. https://www.tensorflow.org/tutorials/using_gpu explains that `with...
Yes, I thought this might be the case. However, the same is true for deberta v2, if I remember correctly, and the answer for that is different. What I was...
I can confirm that if I run one job first that processes the dataset, then I can run any jobs in parallel with no problem (no write-concurrency anymore...).
I'm new to all this, so take my ideas with a serious grain of salt. My guess, based on my understanding of the paper and your comment, is that the POS and...
In fact, if you look at the code, there is a preprocessing step that shows how the entities are set: https://github.com/momohuang/FusionNet-NLI/blob/master/prepro.py#L105 This makes me more confident that I'm right. Hope...
Thank you for clarifying. Wondering why the POS and NER get trained as well when they could potentially get initialized from "traditional" NLP tools. Do you understand why training them...
I'm getting a similar error. I'm using the dev branch and TF 1.1
Same issue here as well.
Funnily enough, I get this only when I use learning rate decay.