Wen Sun

Results 7 issues of Wen Sun

This fix patches the distributed training deployment chapter.

contributor
status: proposed

Update the Chinese documentation for collective communication operations, including:
- paddle.distributed.all_gather
- paddle.distributed.all_reduce
- paddle.distributed.alltoall
- paddle.distributed.broadcast
- paddle.distributed.reduce
- paddle.distributed.recv
- paddle.distributed.irecv
- paddle.distributed.send
- paddle.distributed.isend
- paddle.distributed.scatter
- paddle.distributed.reduce_scatter
- paddle.distributed.new_group

Add Chinese documentation for collective communication operations, including:
- paddle.distributed.destroy_process_group
- paddle.distributed.alltoall_single...

Fix a spelling mistake and add a missing line in 3-7-4.

A larger context size is likely to help with many downstream tasks. Recent work has extended the context size of LLaMA models to 8k and beyond via RoPE scaling. Do...

Also fixed a few typos along the way.

Hi DeepSeek Team, First of all, thank you for open-sourcing **DeepSeek-V3.1** and sharing the impressive benchmark results. Our team has been working to reproduce the results reported in your official...

stale