Xianjie Qiao

Results: 1 issue by Xianjie Qiao

I'm confused about the qkv weight initialization in LightSeq. If qkv_w is initialized from existing weights, there are two ways to initialize it: (assume the qkv shape is [m,k], and...