Nianhui Guo
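I ran into the same problem. Has the commenter above solved it?
> It's those modules from the homepage design. I ran into the same problem. Has the commenter above solved it?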
Hello, I've run into the same problem: I also could not get the right results for W1A1 STS-B (around 68 compared to 71 reported in the paper). May I ask whether...
That is also difficult for me. I have also tried most W1A2 experiments (with a clear accuracy gap) and want to cite and compare BiT in my paper, but the...
> I cannot get the accuracy shown in the paper on most W1A2 or W1A4 tasks, and the accuracy gap is about 10 points. Maybe the released version is...
> I can reproduce the 1-1-1 BERT for all datasets without multi-distillation. But for 1-1-4 and 1-1-2 BERT, my results are way off. Is anyone @kongds @NicoNico6 @TTTTTTris @likethesky @Celebio...
I cannot reproduce most W1A2 experiments by simply changing abits from 1 to 2. More details of the multi-stage distillation are needed to close the performance gap.
Hi, MMA here means the matrix multiplication API in the Tensor Core library. Since I am working on binary neural networks, I am wondering if it is possible to write a...