LuGY
### 🐛 Describe the bug When I ran ViT on CIFAR-10 with a hybrid of PP and TP, I found that the test accuracy decreased with the process of...
### Describe the feature Hi, I find that we provide too many huge models as examples. For instance, we reshape CIFAR-10 to 224*224 and use ViT Huge (at least not...
### 🐛 Describe the bug When I ran gpt2-vanilla with a batch size of 64, there was a CUDA error `RuntimeError: CUDA error: an illegal memory access was encountered`. Then...
* add CI, update Dockerfile
* remove useless loop in inference, add some comments to Attention
* update inference test and CI
* fix path
* add pytest for test...
1. Add inference unit test, remove useless data utils
2. Add CI
3. Modify the wheel building script
4. Remove useless loop in inference script
### 🐛 Describe the bug See the [checkpoint function](https://github.com/hpcaitech/ColossalAI/blob/main/colossalai/utils/activation_checkpoint.py#L141) and how it is used in [CheckpointModule](https://github.com/hpcaitech/ColossalAI/blob/main/colossalai/nn/layer/utils/common.py#L26). Currently, **the only keyword arg** that can be passed to the `checkpoint` function is `use_reentrant`, and can't...
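The report above is truncated, so the following is only a minimal sketch of one possible workaround, assuming the limitation is that a checkpoint-style wrapper forwards positional arguments only. `checkpoint_like` and `layer` are hypothetical stand-ins written for illustration; they are not the ColossalAI or PyTorch API. The idea is to bind keyword arguments with `functools.partial` before handing the callable to the wrapper.

```python
from functools import partial


def checkpoint_like(function, *args, use_reentrant=True):
    """Hypothetical stand-in for a checkpoint API.

    It accepts `use_reentrant` as its only keyword argument and forwards
    positional arguments only. A real activation-checkpoint call would also
    recompute activations during backward; this stub only mimics the
    calling convention.
    """
    return function(*args)


def layer(x, scale=1.0, bias=0.0):
    """Hypothetical layer whose behavior depends on keyword arguments."""
    return x * scale + bias


# Bind the keyword arguments up front, so the wrapper only needs to
# forward the positional input.
bound_layer = partial(layer, scale=2.0, bias=1.0)
result = checkpoint_like(bound_layer, 3.0, use_reentrant=False)
```

With this pattern the wrapper's signature stays unchanged, at the cost of fixing the keyword values when the callable is created.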
## 📌 Checklist before creating the PR - [ ] I have created an issue for this PR for traceability - [x] The title follows the standard format: `[doc/gemini/tensor/...]: A...
1. Support another batch dimension for softmax. In training or batch inference, we may add a batch dimension as the first dimension of some tensors. However, we use the third...