lonelydancer issues

Results: 16 issues by lonelydancer

BART top-k sampling result is wrong

I'm using a BART model for text generation. I modified https://github.com/lonelydancer/algorithm/blob/master/ls_bart2.py and https://github.com/lonelydancer/algorithm/blob/master/hf_bart_export.py, but the generated text is wrong. ![image](https://user-images.githubusercontent.com/548443/148677299-1450fc72-8f7d-40a5-8587-ca426bf21ac0.png)

About BART export

In examples/inference/python/export/export_bart.py, why do you perform the ***Fix encoder layer {} LayerNorm scale and bias*** and ***Fix decoder layer {} LayerNorm scale and bias*** steps? They seem to assign the weight of layer (i+1)...

GPU-Util is low (1%)

### Describe the bug
Run the example in "Get Started in 60 Seconds".
![image](https://user-images.githubusercontent.com/548443/127103356-984ec2c8-2c09-4766-accb-c0c65186c3ce.png)
### Context
- **OS** Ubuntu18.04
- **Hardware** Tesla 80k, cuda 10.1,cudnn7.0
- matchzoo 2.2.0, tensorflow2.2.0, keras2.3.0...

bug

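Getting data_list requires reading everything into memory at once; with a large dataset, memory runs out. How do people usually solve this?

As in the title. As far as I know, both Keras and Paddle can solve this with a generator. I don't know what method PyTorch uses.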
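One common way to get generator-style loading in PyTorch is `torch.utils.data.IterableDataset`, which streams examples from its `__iter__` method instead of indexing a fully loaded list. The following is a minimal sketch, not taken from the project in question: `LineDataset` and the one-example-per-line file layout are assumptions made for illustration, and the import fallback exists only so the sketch also runs where PyTorch is not installed.

```python
import os
import tempfile

try:
    from torch.utils.data import IterableDataset
except ImportError:  # fallback so the sketch still runs without PyTorch installed
    IterableDataset = object


class LineDataset(IterableDataset):
    """Streams one example per line from a text file (hypothetical helper)."""

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        # The file is read lazily, so only one line is held in memory at a time.
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                line = line.rstrip("\n")
                if line:  # skip blank lines
                    yield line


# Usage: write a tiny file, then stream it back line by line.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("first example\nsecond example\n\nthird example\n")

dataset = LineDataset(tmp.name)
examples = list(dataset)
os.remove(tmp.name)
print(examples)
```

With PyTorch installed, such a dataset can be passed to `torch.utils.data.DataLoader` with a `batch_size` to batch the stream. Note that with `num_workers > 0` every worker iterates the whole file unless the dataset shards its input itself, for example with `torch.utils.data.get_worker_info()`.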

RNN layer backward problem

Hi, why is ds = dsv + diff_s? What is the difference between diff_s and ds? I'm confused. Thank you!

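About NSPModel training

1) In the paper, the NSPModel is described as: “To select the most appropriate responses generated by the fine-grained generation model, the evaluation model is trained to estimate the coherence of the responses.” I understand this as training a classification model on the candidates generated in stage 2.1 plus labels, whereas in the code...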

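The mask strategy in the NSP reader sometimes leaves tgt_label empty after sampling, causing an error (Paddle 1.8)

Hello, I am retraining the NSP model on some data and found that the mask strategy can leave tgt_label empty. Specifically, in the _pad_batch_records function of nsp_reader.py: `batch_mask_token_ids, tgt_label, tgt_pos, label_pos = mask(batch_tokens=batch_token_ids, vocab_size=self.vocab_size, bos_id=self.bos_id, eos_id=self.eos_id, mask_id=self.mask_id, sent_b_starts=batch_tgt_start_idx, labels=batch_label, is_unidirectional=False)`. Under the mask strategy, every sampled prob is sometimes > 0.15, so mask_label and mask_pos both end up empty. I worked around this for now by resampling at this point until the result is non-empty.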
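The resample-until-non-empty workaround described above can be sketched as follows. This is an illustration only, not the actual `mask` function from nsp_reader.py: `sample_mask_positions`, its parameters, and the use of the standard `random` module are all assumptions, and only the 0.15 threshold comes from the issue text.

```python
import random


def sample_mask_positions(candidates, mask_prob=0.15, rng=random):
    """Return a non-empty list of positions to mask (hypothetical helper)."""
    if not candidates:
        raise ValueError("no candidate positions to mask")
    if not 0 < mask_prob <= 1:
        raise ValueError("mask_prob must be in (0, 1]")
    while True:
        # A position is masked when its random draw is at or below mask_prob.
        picked = [pos for pos in candidates if rng.random() <= mask_prob]
        if picked:
            # If every draw exceeded mask_prob, the loop samples again.
            return picked


# Usage: even a single candidate position always yields a non-empty result.
rng = random.Random(0)
positions = sample_mask_positions([7], rng=rng)
print(positions)
```

Resampling changes the distribution slightly, since the all-empty outcome is excluded. An alternative would be to fall back to masking one randomly chosen position after a failed draw, which bounds the number of attempts.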
