TTTTTTris
Hello, I've run into the same problem: I could not reproduce the reported results for W1A1 (around 52% accuracy on RTE), and when I try to train W1A2, the result...
My STS-B results are 67.7 (W1A1 w/o multi-distill), 73.5 (W1A2), and 58.0 (W1A1 w/ multi-distill), all still lower than those reported in the paper. I didn't use data parallel.
I cannot reproduce the accuracy reported in the paper on most W1A2 or W1A4 tasks; the gap is about 10 points.