VLMEvalKit
How to run on multi-GPU with device_map='auto'?
When I use a 34B LLM, a single GPU reports OOM, so I set device_map='auto'. But it seems I can't use torchrun with that, and inference takes too much time. How can I solve this problem?
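For reference, a minimal sketch of the loading pattern I mean, using the standard transformers API (the checkpoint path here is a placeholder, not something from VLMEvalKit):

```python
# Sketch: load a large model sharded across all visible GPUs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "01-ai/Yi-34B"  # placeholder; substitute the actual VLM checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,   # half precision to cut memory roughly in half
    device_map="auto",           # accelerate places layers across available GPUs
    trust_remote_code=True,
)
model.eval()

# Note: with device_map="auto", the script is launched with plain `python`,
# not `torchrun` -- accelerate's big-model inference splits the model's layers
# across GPUs inside one process (pipeline-style), rather than running
# data-parallel replicas across processes. Only one GPU computes at a time,
# which is why inference can feel slow compared to torchrun data parallelism.
```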
Hi, which VLM are you using?
Closing the issue due to no response in weeks; please reopen if needed.
I use the Yi-34B LLM. It seems the visual encoder costs too much time (32 frames per video).
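In case it helps pin down the bottleneck: one thing I would check is whether the 32 frames go through the visual encoder one at a time or in a single batched forward pass. A hedged sketch, assuming a CLIP-style vision tower (the actual encoder depends on the VLM, so CLIPVisionModel here is just an illustration):

```python
# Sketch: encode all 32 frames of a video in one batched forward pass
# instead of looping frame by frame.
import torch
from transformers import CLIPVisionModel, CLIPImageProcessor

vision_id = "openai/clip-vit-large-patch14"  # assumption; not necessarily this VLM's tower
encoder = CLIPVisionModel.from_pretrained(
    vision_id, torch_dtype=torch.float16
).to("cuda").eval()
processor = CLIPImageProcessor.from_pretrained(vision_id)

def encode_frames(frames):
    """frames: a list of 32 PIL images sampled from one video."""
    inputs = processor(images=frames, return_tensors="pt")
    pixel_values = inputs["pixel_values"].to("cuda", dtype=torch.float16)
    with torch.no_grad():                 # inference only, no autograd overhead
        out = encoder(pixel_values=pixel_values)
    return out.last_hidden_state          # shape: (32, num_patches, hidden_dim)
```

If the frames are already batched like this and it is still slow, the cost is probably the per-layer GPU hops from device_map='auto' rather than the encoder itself.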