linghan1997

Results: 4 issues by linghan1997

Hi, I have some questions regarding CUDA usage when I call the GPipe class. When GPipe runs split_module and calls partition.to(device), it fails in torch._C._cuda_init() with RuntimeError: No CUDA GPUs are available...

I wonder what the label is for _y = to_categorical(Y_profiling, num_classes=256)_, since Y contains multiple tensors within one sample. As expected, an error occurs: _TypeError: Cannot cast array data from dtype([('alpha_mask',...

Hi, I have a question about batch size in Distributed Data Parallel. In my understanding, each GPU computes its loss separately on its own node, so why can 64*64 GPUs equal...
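
The preview above is cut off, so what the 64*64 figure is being compared to is not shown. As general background only, here is a minimal sketch of why per-GPU losses can still behave like one large batch, assuming a mean-reduced loss, equal-sized shards, and gradient averaging across ranks. The toy model and the names `mean_grad` and `ddp_style_grad` are hypothetical and use plain Python rather than any PyTorch API.

```python
# Toy model (an assumption for illustration): y_hat = w * x with squared-error loss,
# averaged over the batch.
def mean_grad(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def ddp_style_grad(w, xs, ys, world_size):
    """Split the batch into equal shards, take each shard's gradient, then average them."""
    per_rank = len(xs) // world_size
    shard_grads = [
        mean_grad(w,
                  xs[r * per_rank:(r + 1) * per_rank],
                  ys[r * per_rank:(r + 1) * per_rank])
        for r in range(world_size)
    ]
    return sum(shard_grads) / world_size


xs = [float(i) for i in range(1, 9)]   # global batch of 8 samples
ys = [3.0 * x + 1.0 for x in xs]
w = 0.5

full = mean_grad(w, xs, ys)                        # one process, whole batch
sharded = ddp_style_grad(w, xs, ys, world_size=4)  # 4 "ranks", 2 samples each
print(abs(full - sharded) < 1e-9)
```

Under these assumptions the averaged shard gradients match the full-batch gradient, so the effective batch size is the per-GPU batch size times the number of GPUs.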

Hi, thank you for your excellent work and the high-quality open-source releases. Since you have also open-sourced several datasets for both SFT and RL, such as miromind-M1 and miroverse-v0.1, I...