Tianwei She

4 comments by Tianwei She

Thanks for the reply! I'm using an AWS [g5.48xlarge instance](https://aws.amazon.com/ec2/instance-types/g5/), which has 192 GiB of GPU memory in total.

Thanks for replying! @younesbelkada I printed out the `device_map`, and some modules are indeed not on a GPU: `'transformer.h.69': 'disk', 'transformer.ln_f': 'disk'`. The map begins: {'transformer.word_embeddings': 0, 'lm_head': 0, 'transformer.word_embeddings_layernorm': 0, 'transformer.h.0':...
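For anyone checking their own map, here is a minimal sketch of how the offloaded entries could be picked out in plain Python. The helper name `offloaded_modules` and the sample dictionary are illustrative assumptions; the sample is a short excerpt, not the full map from the comment.

```python
def offloaded_modules(device_map):
    """Return the entries placed on 'cpu' or 'disk' instead of a GPU index."""
    return {name: dev for name, dev in device_map.items() if dev in ("cpu", "disk")}


# Hypothetical excerpt: integer values are GPU indices, strings mean offloading.
sample_map = {
    "transformer.word_embeddings": 0,
    "lm_head": 0,
    "transformer.h.69": "disk",
    "transformer.ln_f": "disk",
}

offloaded = offloaded_modules(sample_map)
print(offloaded)
```

An empty result would mean every module in the map was assigned to a GPU.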

@TimDettmers btw I also tried tuning the `int8_threshold` parameter: with `int8_threshold = 0`, the memory usage is the same as with the default `int8_threshold = 6.0`. Just wanted to confirm, is...

@mrwyattii Hi, I'm getting CUDA OOM errors when loading an `EleutherAI/gpt-neox-20b` model onto 8 GPUs with TP=8 in FP16. Each GPU has 23GB. Is this expected? And does this mean I...
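As a rough sanity check on the numbers in this question, the sketch below estimates the FP16 weight footprint per GPU under tensor parallelism. It assumes roughly 20 billion parameters (inferred from the model name) and 2 bytes per FP16 parameter; the function name `weight_gib_per_gpu` is hypothetical. It counts weights only and says nothing about activations, the KV cache, or how a given loader stages the checkpoint before sharding.

```python
def weight_gib_per_gpu(n_params, bytes_per_param, tp_degree):
    """Estimate the weight memory per GPU in GiB when sharded evenly over tp_degree GPUs."""
    total_bytes = n_params * bytes_per_param
    return total_bytes / tp_degree / 2**30


# Assumed values: ~20e9 parameters, 2 bytes each (FP16), TP=8 as in the question.
per_gpu = weight_gib_per_gpu(20e9, 2, 8)
print(f"{per_gpu:.2f} GiB of weights per GPU")
```

If the estimate is well below the per-GPU capacity, the peak during loading is the next thing to look at, since this figure only describes the steady state after sharding.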