Ghost Screaming
Add introduction for sequence_parallel in README.
Still work in progress.
### PR types Others ### PR changes Others ### Description Test 3D Sequence Parallel (SP) accuracy.
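### PR types Others ### PR changes Others ### Description Code for bitwise alignment of Sequence Parallel (SP). With LLaMA reduced to 1 layer and mp_degree = 2, at the first step, dynamic-graph manual-parallel SP and dynamic-graph semi-auto-parallel SP align bitwise (the forward output of every layer and the backward params.grad). Based on [PR 7609](https://github.com/PaddlePaddle/PaddleNLP/pull/7609)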
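A bitwise alignment check of the kind this PR describes can be sketched as follows. This is a minimal illustration, not the PR's actual code: it assumes the per-layer forward outputs and the parameter gradients of each run have already been dumped to NumPy arrays keyed by name, and the helper names `bitwise_equal` and `first_mismatch` are hypothetical.

```python
import numpy as np


def bitwise_equal(a, b):
    """Return True only if both arrays have identical dtype, shape, and bytes."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.dtype != b.dtype or a.shape != b.shape:
        return False
    # Comparing raw bytes is stricter than np.allclose: no tolerance is applied.
    return a.tobytes() == b.tobytes()


def first_mismatch(manual, semi_auto):
    """Return the first key whose arrays differ bitwise, or None if all match.

    `manual` and `semi_auto` are assumed to be dicts mapping a layer or
    parameter name to an array dumped from each run.
    """
    for name in manual:
        if name not in semi_auto:
            return name
        if not bitwise_equal(manual[name], semi_auto[name]):
            return name
    return None
```

Comparing raw bytes rather than using a tolerance is deliberate here: the claim being verified is exact agreement, so any rounding difference between the two runs should be reported.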
When FP32 and FP16 models run on an A100 machine, they can be accelerated using Tensor Cores. Although NVIDIA states that FP32 computation is moved to Tensor Cores automatically on A100, we...
### PR types Others ### PR changes Others ### Description Auto parallel memory and performance analysis for dynamic graph.
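### PR types New features ### PR changes Models ### Description Support qwen for auto_parallel. #### 1. Verify the auto-parallel architecture on QWen - Compared dynamic-graph semi-auto parallel with dynamic-graph manual parallel; convergence and accuracy results are as expected. The following strategies were verified: - dp2mp2pp2 - dp2mp2pp2 + amp - dp2mp2pp2 + dynamic-to-static...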
### 🐛 Describe the bug I tried to train a vision transformer following the instructions in README.md. However, `imagenet1k/train.py` throws an error that there is no module named `model_zoo`. Corresponding...
### PR types Others ### PR changes Others ### Description Dynamic Auto Parallel llama 3D Sequence Parallel Convergence Verification.
bash run_pretrain_3D.sh 1 1 4 (dp_degree, mp_degree, pp_degree)
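As a rough sanity check on the arguments above, the number of devices a hybrid-parallel run needs is usually the product of its parallel degrees. The sketch below assumes that convention holds for this script and that no other parallel dimension (such as sharding) is involved; `required_devices` is a hypothetical helper, not part of the script.

```python
def required_devices(dp_degree, mp_degree, pp_degree):
    """Return the device count implied by the three parallel degrees.

    Assumes the run uses only data, model (tensor), and pipeline parallelism,
    so the total is simply their product.
    """
    for name, value in (
        ("dp_degree", dp_degree),
        ("mp_degree", mp_degree),
        ("pp_degree", pp_degree),
    ):
        if value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value}")
    return dp_degree * mp_degree * pp_degree


# The invocation shown above uses dp_degree=1, mp_degree=1, pp_degree=4.
print(required_devices(1, 1, 4))  # prints 4
```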