wangyuxin87

Results 6 issues of wangyuxin87

Does 'dcn_v2_psroi_pooling_forward' mean deformable position-sensitive RoI pooling?

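Hello author, I would like to ask whether this TensorFlow version can reach the accuracy reported in the original paper. Is it slightly lower or slightly higher?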

Can Tutel be used with Megatron-DeepSpeed?

Thanks for your excellent work. However, GAU is slower than the original MHSA in my implementation, **3.5s vs 0.7s**. As I simply use "from flash_pytorch import GAU" with the default...

https://github.com/datamllab/LongLM

Thanks for your great work. When will the code be released?