
CUDA_LAUNCH_BLOCKING=1, TORCH_USE_CUDA_DSA

Open lhtpluto opened this issue 2 years ago • 4 comments

Running bash run.sh finetune_moss.py fails with the following exception: RuntimeError: CUDA error: out of memory CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

CUDA / PyTorch versions: I have tested various combinations, and the error never changes.

lhtpluto avatar Jun 06 '23 15:06 lhtpluto

Confirmed: this was caused by the WSL memory settings.

The problem has been solved.

lhtpluto avatar Jun 08 '23 03:06 lhtpluto

How did you change this WSL? @lhtpluto

tonycbcd avatar Jun 13 '23 08:06 tonycbcd

How did you change this WSL? @lhtpluto

My English is not good, so I can't understand this.

lhtpluto avatar Jun 13 '23 10:06 lhtpluto

How did you fix the WSL memory settings problem, and what are the steps? @lhtpluto

Create a new .wslconfig file under C:\Users\<username>\

Example .wslconfig contents:

[wsl2]
memory=480GB
swap=32GB
processors=56
localhostForwarding=true
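For anyone who prefers to generate the file rather than type it, here is a minimal Python sketch that renders the same [wsl2] section. The function name render_wslconfig and its default values are assumptions made for illustration (the defaults just repeat the example above, which was sized for that particular machine); this is not part of MOSS or WSL.

```python
import configparser
import io


def render_wslconfig(memory="480GB", swap="32GB", processors=56):
    """Return .wslconfig text containing a [wsl2] section (hypothetical helper)."""
    parser = configparser.ConfigParser()
    # Keep key case as written, e.g. localhostForwarding.
    parser.optionxform = str
    parser["wsl2"] = {
        "memory": memory,
        "swap": swap,
        "processors": str(processors),
        "localhostForwarding": "true",
    }
    buf = io.StringIO()
    # Write key=value without spaces around "=" to match the example.
    parser.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    print(render_wslconfig())
```

Save the output as C:\Users\<username>\.wslconfig and adjust the values to your own hardware. The new limits only apply after WSL is restarted, for example by running wsl --shutdown from Windows and then starting the distribution again.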

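========================
Note: WSL appears to support only 64 threads, and DeepSpeed does not support hyper-threading, so when using a W9-3495X you need to disable hyper-threading in the BIOS.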
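To check that the new limits were picked up, one option is to look at what the Linux side of WSL actually sees. The following is a rough sketch, assuming it is run inside the WSL distribution; the helper name mem_total_gib is made up for this example.

```python
import os


def mem_total_gib(meminfo_text):
    """Parse the MemTotal line (reported in kB) from /proc/meminfo text; return GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])
            return kib / (1024 ** 2)
    raise ValueError("MemTotal not found")


if __name__ == "__main__":
    # /proc/meminfo only exists on Linux, so skip the read elsewhere.
    if os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as f:
            print(f"RAM visible: {mem_total_gib(f.read()):.1f} GiB")
    print(f"Logical CPUs visible: {os.cpu_count()}")
```

If the reported RAM is still far below the memory value in .wslconfig, or the CPU count does not match processors, the file was probably not read (wrong location or WSL not restarted).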

lhtpluto avatar Jun 16 '23 01:06 lhtpluto