lqf0624

3 comments by lqf0624

Where are files for implementation of gateway?

> Any tips on how to reduce VRAM requirements?
>
> I'm training the 2.8B Mamba and I'm OOM on 16k context on an A100 80GB. Batch size of 1....

> Any solution?

There seems to be a bug in transformers: importing it overrides the `CUDA_VISIBLE_DEVICES` value that I set myself. I changed my training script from `accelerate launch CUDA_VISIBLE_DEVICES=0,1,2` to...