Ghost Screaming issues

Results: 10 issues of Ghost Screaming

Add introduction for sequence_parallel in README.

[WIP] Add recompute support for imagen model.

Still work in progress.

[AutoParallel] Test 3d SP acc

### PR types Others ### PR changes Others ### Description Test 3d SP acc

stale

[WIP] Test for sequence parallel

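### PR types Others ### PR changes Others ### Description Code for bitwise alignment of Sequence Parallel. With LLaMA scaled down to 1 layer and mp_degree = 2, at the first step, dynamic-graph manual-parallel SP and dynamic-graph semi-auto-parallel SP align bitwise (per-layer forward outputs and backward params.grad). Based on [PR 7609](https://github.com/PaddlePaddle/PaddleNLP/pull/7609)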

stale
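The bitwise alignment check described in the sequence parallel test above can be illustrated with a minimal sketch. It is not the code from the PR: the helper names `float32_bits` and `first_bitwise_mismatch` are hypothetical, and the values stand in for flattened layer outputs or `params.grad` entries that would be dumped from the two runs. It compares float32 bit patterns rather than using a tolerance, so even a one-ulp difference is reported.

```python
import struct


def float32_bits(x):
    """Return the IEEE-754 float32 bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def first_bitwise_mismatch(a, b):
    """Return the index of the first element whose float32 bits differ,
    or None if the two sequences are bitwise identical."""
    if len(a) != len(b):
        raise ValueError("sequences have different lengths")
    for i, (x, y) in enumerate(zip(a, b)):
        if float32_bits(x) != float32_bits(y):
            return i
    return None


# Hypothetical values standing in for outputs from the two runs.
manual_sp = [0.1, 0.25, -3.5]
semi_auto_sp = [0.1, 0.25, -3.5]
perturbed = [0.1, 0.25 + 1e-7, -3.5]

print(first_bitwise_mismatch(manual_sp, semi_auto_sp))
print(first_bitwise_mismatch(manual_sp, perturbed))
```

A bit-pattern comparison is stricter than an `allclose`-style check: it also distinguishes `0.0` from `-0.0`, which a numeric equality test would treat as equal.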

Set flag FLAGS_enable_cublas_tensor_op_math to True by default.

When FP32 and FP16 models run on an A100 machine, they can be accelerated using Tensor Cores. Although NVIDIA states that FP32 computation is automatically routed to Tensor Cores on A100, we...

[DO NOT MERGE] Auto parallel memory and performance analysis.

### PR types Others ### PR changes Others ### Description Auto parallel memory and performance analysis for dynamic graph.

stale

[AutoParallel] Support qwen for auto_parallel

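### PR types New features ### PR changes Models ### Description Support qwen for auto_parallel. #### 1. Validate the auto-parallel architecture on QWen - Comparing dynamic-graph semi-auto parallel against dynamic-graph manual parallel, the convergence and accuracy results meet expectations; the following strategies were verified - dp2mp2pp2 - dp2mp2pp2 + amp - dp2mp2pp2 + dynamic-to-static...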

Where is model_zoo module?

### 🐛 Describe the bug I tried to train a vision transformer following the instructions in README.md. However, it throws an error in `imagenet1k/train.py` saying that there is no module named `model_zoo`. Corresponding...

[DO NOT Merge] Test dynamic auto parallel 3d sp acc

### PR types Others ### PR changes Others ### Description Dynamic Auto Parallel llama 3D Sequence Parallel Convergence Verification.

stale

[DO NOT MERGE] Dynamic Auto parallel Performance Test.

bash run_pretrain_3D.sh 1 1 4 (dp_degree, mp_degree, pp_degree)

stale
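As a rough aid for reading the `run_pretrain_3D.sh 1 1 4` invocation above, the sketch below computes how many devices a three-way hybrid configuration needs. It assumes the three arguments are the data, model, and pipeline parallel degrees in that order (as the listing states) and that no further parallel dimension, such as sharding, adds ranks. The function name `required_devices` is hypothetical and is not part of the script.

```python
def required_devices(dp_degree, mp_degree, pp_degree):
    """Number of devices needed when the only parallel dimensions are
    data, model (tensor), and pipeline parallelism (an assumption)."""
    degrees = {
        "dp_degree": dp_degree,
        "mp_degree": mp_degree,
        "pp_degree": pp_degree,
    }
    for name, value in degrees.items():
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    return dp_degree * mp_degree * pp_degree


# The configuration from the listing: dp=1, mp=1, pp=4.
print(required_devices(1, 1, 4))
```

Under these assumptions, the configuration shown is a pure pipeline-parallel run across four devices.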