Qwen2-72B, 16K long context: OOM when converting the checkpoint to an HF model
Training on 16K-long sequences finished fine, but converting the checkpoint to an HF model hits OOM. I have six GPUs, yet apparently only one is being used. How should I change the config file? torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 116.00 MiB. GPU 0 has a total capacity of 79.33 GiB of which 39.81 MiB is free. Process 1333184 has 79.28 GiB memory in use. Of the allocated memory 78.59 GiB is allocated by PyTorch, and 218.78 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
I ran into the same problem. Has it been solved?
Solved: add a device_map='auto' parameter to the llm dict inside model in the config.
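For illustration, here is a minimal sketch of what that config change might look like. This assumes an XTuner-style config where model wraps the base LLM via AutoModelForCausalLM.from_pretrained; the SupervisedFinetune type and the exact field names are assumptions, not taken from the poster's actual config. With device_map='auto', Accelerate shards the 72B weights across all visible GPUs (and CPU, if needed) at load time instead of placing everything on GPU 0.

```python
# Hypothetical XTuner-style config fragment (field names assumed).
from transformers import AutoModelForCausalLM
from xtuner.model import SupervisedFinetune

pretrained_model_name_or_path = 'Qwen/Qwen2-72B'

model = dict(
    type=SupervisedFinetune,
    llm=dict(
        type=AutoModelForCausalLM.from_pretrained,
        pretrained_model_name_or_path=pretrained_model_name_or_path,
        torch_dtype='auto',
        device_map='auto',  # shard weights across all available GPUs instead of GPU 0 only
    ),
)
```

After this change, rerun the conversion command; nvidia-smi should show memory use spread over all six cards rather than a single full GPU.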
How much hardware did you use for 72B with 16K context? Would 8× 80 GB GPUs be enough?