Xiao

Results 25 issues of Xiao

### 🐛 Describe the bug https://github.com/hpcaitech/ColossalAI-Examples/issues/196 https://github.com/hpcaitech/ColossalAI-Examples/issues/195 https://github.com/hpcaitech/ColossalAI-Examples/issues/193 ### Environment _No response_

bug

### 🐛 Describe the bug I ran the code with synthetic data using the command below, and it hit a problem. ` torchrun --nproc_per_node=4 train.py --synthetic 2>&1 | tee...

bug

### 🐛 Describe the bug I changed my config as shown below, and it runs on 1 GPU. The global batch size is 1. ``` from colossalai.amp import AMP_TYPE DATA_PATH = '/data/v-xxshi/coloss/raw_data/my-bert_text_sentence'...

bug

I set up my Azure account and tried to run "bash tests/run_smoke_tests.sh test_minimal", but there were some errors. So, if I want to contribute to this project, the first thing I...

My DeepSpeed version is 0.8.1, my torch version is 1.13.1, and my transformers version is 4.21.2. My CPU memory is 500 GB. I followed the [document](https://github.com/microsoft/DeepSpeedExamples/tree/master/inference/huggingface/text-generation) to run my code....

bug
inference

- My transformers version is 4.21.2 and my DeepSpeed version is 0.8.1. I ran [bloom-ds-zero-inference.py](https://github.com/huggingface/transformers-bloom-inference/blob/main/bloom-inference-scripts/bloom-ds-zero-inference.py) on two nodes with the following command: ``` torchrun --node_rank=0 --nnodes 2 --nproc_per_node=8 --master_addr...

bug
inference

### Branch/Tag/Commit 9b6d718b52f10f08a810c0885e070789e462102b ### Docker Image Version nvcr.io/nvidia/pytorch:22.09-py3 ### GPU name V100 ### CUDA Driver Driver Version: 510.73.08 ### Reproduced Steps ```shell 1 I use the script to convert my...

bug

### 📚 The doc issue It seems the pipeline parallel document is out of date (https://www.colossalai.org/docs/features/pipeline_parallel). When I try to use `from colossalai.builder import build_pipeline_model`, it seems it's none.

### 🐛 Describe the bug I tried to run a config using [train_gpt.py](https://github.com/hpcaitech/ColossalAI-Examples/blob/main/language/gpt/train_gpt.py). I added a model to [gpt.py](https://github.com/hpcaitech/Titans/blob/main/titans/model/gpt/gpt.py). ``` def gpt2_test4gpu350M(**kwargs): model_kwargs = dict(hidden_size=1024, depth=24,...

### 🐛 Describe the bug When I try to run [pipeline_gpt1d.py](https://github.com/hpcaitech/ColossalAI-Examples/blob/main/language/gpt/model/pipeline_gpt1d.p), there is an error: I can't find the module in [import model_zoo.gpt.gpt as col_gpt](https://github.com/hpcaitech/ColossalAI-Examples/blob/main/language/gpt/model/pipeline_gpt1d.py#L12). So, what is model_zoo?...