Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Tested with image-text pairs using the vit_b model; the image-text similarities were [0.0937, 0.0846, 0.0857].
Hi, in the paper, a linear transformation is applied to match the hidden representations between the student and teacher embeddings. In the code, this is implemented using `fit_dense`, but this layer...
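For context, such a projection is typically a single linear layer that maps the student hidden size up to the teacher's before computing the hidden-state loss. A minimal stdlib-only sketch (illustrative only; the names, dimensions, and plain-list math here are assumptions, not the repo's actual `fit_dense` code):

```python
import random

def linear_project(hidden, weight, bias):
    """Project a student hidden vector (dim d_s) to teacher dim d_t: y = W @ h + b."""
    d_t = len(weight)
    return [sum(w * h for w, h in zip(weight[i], hidden)) + bias[i]
            for i in range(d_t)]

random.seed(0)
d_student, d_teacher = 4, 6   # hypothetical sizes (e.g. 312 -> 768 in TinyBERT)
h_student = [random.random() for _ in range(d_student)]
W = [[random.random() for _ in range(d_student)] for _ in range(d_teacher)]
b = [0.0] * d_teacher
h_projected = linear_project(h_student, W, b)
assert len(h_projected) == d_teacher  # now comparable to the teacher's hidden state
```

After this projection, student and teacher hidden states have the same dimensionality, so an elementwise MSE between them is well defined.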
Hello authors, using the same image-text inputs, I tried the wukong_vit_b and wukong_vit_b_g models. The similarity results were wukong_vit_b: [0.0855, 0.0848, 0.0854, 0.0881] and wukong_vit_b_g: [0.0521, 0.0429, 0.0910, 0.1226]; the ground truth is at index 3, and the results were not softmaxed. The wukong_vit_b scores only differ from the third decimal place onward, while wukong_vit_b_g is actually more discriminative, so the token-wise similarity does not seem to help much here. Did your experiments show the same behavior?
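For easier comparison, a softmax over the reported raw similarity scores makes the relative gap between candidates explicit. A minimal sketch in plain Python (the scores below are the un-normalized values quoted above; no temperature scaling is assumed):

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of similarity logits."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Un-normalized similarities reported above (ground truth at index 3)
vit_b   = [0.0855, 0.0848, 0.0854, 0.0881]
vit_b_g = [0.0521, 0.0429, 0.0910, 0.1226]

print(softmax(vit_b))    # near-uniform: the four scores barely differ
print(softmax(vit_b_g))  # index 3 stands out more clearly
```

Both models do rank index 3 highest; the difference is in the margin, which the softmax makes visible.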
About the Wukong dataset
I see there are several ILSVRC datasets. Which one did you use for the Wukong test set? Could you share a link?
Dear authors of TinyBERT, thanks for the great work. TinyBERT clearly lays out how to do distillation. We would like to apply your final models...
If I want to import my own Chinese data into TinyBERT, what should I do?
Hi! I've been trying to measure MLM perplexity for TinyBERT model (in particular, [tinybert6l](https://huggingface.co/huawei-noah/TinyBERT_General_6L_768D)), and I keep getting inconsistent results. Looks like the MLM head for TinyBERT is not loaded...
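As a sanity check that is independent of how the checkpoint (and its MLM head) gets loaded: MLM perplexity is just the exponential of the mean negative log-likelihood over the masked positions. A minimal sketch in plain Python (the probabilities below are made-up illustration values, not TinyBERT outputs):

```python
import math

def mlm_perplexity(token_probs):
    """Perplexity = exp(mean negative log-likelihood of the gold masked tokens)."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

# Hypothetical model probabilities assigned to the gold tokens at masked positions
probs = [0.25, 0.10, 0.50, 0.05]
print(mlm_perplexity(probs))
```

If the MLM head is randomly initialized instead of loaded from the checkpoint, these probabilities sit near uniform over the vocabulary, which is one common cause of inconsistent (and implausibly high) perplexity numbers.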
Thanks for your work. Where can I directly download the pre-trained model and vocab mentioned in the **CeMAT** README?
AsymLsqQuantizer requires alpha (i.e., input_clip_val) to be positive, and LSQ requires the parameter to be learnable. How do you guarantee that clip_val is both learnable and that the learned value stays positive? I could not find such a constraint in the code. Thanks for clarifying.
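One common way to keep a learnable clipping value strictly positive (a generic sketch, not necessarily this repo's actual mechanism) is to learn an unconstrained parameter and map it through a strictly positive function such as softplus, so the optimizer can update it freely while the effective clip value can never cross zero:

```python
import math

def softplus(x):
    """Smooth, strictly positive mapping: softplus(x) = log(1 + e^x) > 0 for all real x."""
    return math.log1p(math.exp(x))

# Learn an unconstrained theta; derive the clip value from it, so gradient
# updates on theta are unconstrained while alpha = softplus(theta) stays > 0.
theta = -3.0              # hypothetical learned parameter (any real number)
alpha = softplus(theta)   # the effective clip value, always positive
print(alpha)
```

Alternatives with the same effect include an exp reparameterization or clamping clip_val to a small positive epsilon in the forward pass; which (if any) the repo uses would need to be confirmed in the quantizer code.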