Mryangkaitong
These are some of my earlier demos, kept here for getting started. I now mainly work on deep learning projects, so there is nothing more in-depth here for the moment; I will update when I get the chance.
> > May I ask about line 54 of ChineseNRE/data/people-relation/data_util.py, `set_ids = range(1, len(set_words)+1)`? The ids for the list of all words then start from 1, while nn.Embedding looks up a word's vector starting from index 0. Doesn't that cause a mismatch? Why not simply write `set_ids = range(len(set_words))`? Thanks.
>
> Index 0 is probably reserved for the unknown word; look for it further down in the code.

Doesn't line 61, `id2word[len(id2word)+1]="UNKNOW"`, set the *last* id to the unknown word instead?
> The non-word index 0 should be the padding that is appended later to fill out a batch.
>
> You can tweak the code and try it out; I honestly don't remember the details of the code anymore.

@Mryangkaitong OK, thanks.
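For context, here is a minimal sketch of the indexing convention being debated: word ids start at 1 so that index 0 stays free for padding, and the unknown word takes the last id. The names `word2id` and `UNK_ID` are illustrative and are not taken from `data_util.py`.

```python
import torch
import torch.nn as nn

# Illustrative vocabulary; in data_util.py this would be set_words.
set_words = ["北京", "上海", "公司"]

# Ids start at 1 so that index 0 is left free for padding.
word2id = {w: i for i, w in enumerate(set_words, start=1)}
id2word = {i: w for w, i in word2id.items()}

# The unknown word takes the next free id after all real words.
UNK_ID = len(word2id) + 1
id2word[UNK_ID] = "UNKNOW"
word2id["UNKNOW"] = UNK_ID

# The embedding table needs len(set_words) + 2 rows:
# row 0 for padding, rows 1..N for real words, row N+1 for UNKNOW.
embedding = nn.Embedding(num_embeddings=len(set_words) + 2,
                         embedding_dim=8,
                         padding_idx=0)

# A padded sentence: real ids, the unknown id, then 0s as batch padding.
sentence = torch.tensor([word2id["北京"], word2id["公司"], UNK_ID, 0, 0])
vectors = embedding(sentence)           # shape: (5, 8)
print(vectors[-1].abs().sum().item())   # 0.0 — the padding row stays all zeros
```

With this layout there is no mismatch with nn.Embedding: the table simply has one extra row at index 0 reserved for padding, so word ids and embedding rows stay aligned.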
> > > The STS-B task data does have labels, so why is this called unsupervised training? I don't get it.

https://blog.csdn.net/weixin_42001089/article/details/109499760
> I have a similar question: why is there a peak in "loss/total" and "loss/policy"? Is your KL becoming more and more negative? According to my understanding, the method...
Thanks a lot, it worked
> Note that the KL-divergence (`objective-kl`) should never be negative. This can happen if you generate text with settings that are not pure sampling (e.g. early stopping strategies, minimum generation...
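As a reference for the "pure sampling" point in the quote above, this is a minimal sketch of generation settings that do not constrain the sampled distribution (no top-k/top-p truncation, no minimum length), which is what can otherwise make the estimated KL against the reference model go negative. The model name "gpt2" and the prompt are placeholders, and the exact kwargs follow the usual pure-sampling recipe rather than any setting taken from this thread; in a PPO loop the same kwargs would be passed to the trainer's generation step.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative only: "gpt2" stands in for whatever policy model is being trained.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

query = tokenizer("The movie was", return_tensors="pt")

# Pure sampling: no top-k/top-p truncation and no minimum-length constraint.
# Decoding constraints change the distribution the tokens are drawn from,
# so the log-prob ratio used as the KL estimate can turn negative.
generation_kwargs = {
    "do_sample": True,     # sample instead of greedy/beam search
    "top_k": 0,            # disable top-k filtering
    "top_p": 1.0,          # disable nucleus filtering
    "min_length": -1,      # disable the minimum-length constraint
    "max_new_tokens": 20,
    "pad_token_id": tokenizer.eos_token_id,
}

output = model.generate(**query, **generation_kwargs)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```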