Valkyria-lenneth

7 comments by Valkyria-lenneth

The structure of Longformer's attention windows requires the input sequence length to be a multiple of the window length. To use it, you can pad your input sequence to 512 or...
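A minimal sketch of the padding idea, assuming you already have token IDs and a pad token ID (the function name and signature here are illustrative, not part of Longformer's API):

```python
def pad_to_window_multiple(input_ids, window, pad_id):
    """Pad a token-ID list so its length is a multiple of the attention window."""
    remainder = len(input_ids) % window
    if remainder:
        # append pad tokens until the next multiple of `window`
        input_ids = input_ids + [pad_id] * (window - remainder)
    return input_ids
```

For example, a 700-token sequence with a window of 512 would be padded up to 1024; remember to extend the attention mask with zeros over the padded positions as well.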

That's true. Even though Longformer is "long", it only extends the input length to 1K or 4K.

I haven't used a two-tower (dual-encoder) model, but the [cls] vector, which represents the whole text, can be used for similarity computation.
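A minimal sketch of similarity over two [cls] vectors, assuming you have already extracted them as plain number lists (cosine similarity is the usual choice, though the comment does not prescribe a metric):

```python
import math

def cls_cosine_similarity(a, b):
    """Cosine similarity between two [cls] sentence vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Identical vectors score 1.0 and orthogonal vectors score 0.0, which makes the value easy to threshold for retrieval or matching.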

As I recall, it was transformers 3.2, torch 1.1, CUDA 11, but as long as the model itself is fine, you can just load the parameters directly. If forward raises an error, the API may have changed.
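A sketch of what "load the parameters directly" amounts to, written here with plain dicts to illustrate the behavior of PyTorch's `load_state_dict(..., strict=False)` (the helper name is my own, not from the repo): parameters whose names match are copied over, and mismatches are reported rather than raised.

```python
def load_compatible(model_state, ckpt_state):
    """Copy matching parameters from a checkpoint into a model state dict.

    Mimics torch load_state_dict with strict=False: matching keys are
    loaded, and missing/unexpected keys are returned for inspection.
    """
    loaded = {k: v for k, v in ckpt_state.items() if k in model_state}
    missing = [k for k in model_state if k not in ckpt_state]
    unexpected = [k for k in ckpt_state if k not in model_state]
    model_state.update(loaded)
    return missing, unexpected
```

If the returned lists are empty, the checkpoint and model agree; a non-empty list is the first place to look when a version bump changes parameter names.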

You can look at the settings in the Longformer source code; this model's structure is completely identical to the original Longformer. It has been too long since I looked at their settings, so I don't remember them.

Is there any progress?

I see. It is a spelling mistake; it should be `LongformerZhForMaskedLM`. Thanks for your reply.