CUDA_LAUNCH_BLOCKING=1, TORCH_USE_CUDA_DSA

Open lhtpluto opened this issue 2 years ago • 4 comments

Running bash run.sh finetune_moss.py fails with an exception: RuntimeError: CUDA error: out of memory CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

CUDA / PyTorch versions: I tested many combinations, and the error is always the same.
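The error text suggests passing CUDA_LAUNCH_BLOCKING=1 so that the stack trace points at the call that actually failed. A minimal sketch of one way to do that from Python follows. The helper names blocking_env and run_with_blocking are hypothetical, and the command is the one quoted in the report. TORCH_USE_CUDA_DSA is described by the message as a compile-time option, so it is not covered here.

```python
import os
import subprocess


def blocking_env(base=None):
    """Return a copy of the environment with synchronous CUDA launches enabled.

    `base` defaults to the current process environment; the original
    mapping is never modified.
    """
    env = dict(os.environ if base is None else base)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_with_blocking(cmd):
    """Run `cmd` (a list of arguments) with CUDA_LAUNCH_BLOCKING=1 set.

    Returns the subprocess.CompletedProcess so the caller can inspect
    the exit code.
    """
    return subprocess.run(cmd, env=blocking_env(), check=False)


# Example call, using the command from the report above:
# run_with_blocking(["bash", "run.sh", "finetune_moss.py"])
```

Setting the variable in a copied environment keeps the change scoped to the launched script. In this thread the root cause turned out to be the WSL memory limit, so the flag would only have made the trace more accurate, not removed the error.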

Jun 06 '23 15:06 lhtpluto

Confirmed: it was a WSL memory configuration issue.

The problem is solved.

Jun 08 '23 03:06 lhtpluto

How did you change this WSL? @lhtpluto

Jun 13 '23 08:06 tonycbcd

How did you change this WSL? @lhtpluto

My English is not good, so I cannot understand this.

Jun 13 '23 10:06 lhtpluto

How did you fix the WSL memory setting issue, and what are the steps? @lhtpluto

Create a new file named .wslconfig under C:\Users\<username>\

Example .wslconfig contents:

[wsl2]
memory=480GB
swap=32GB
processors=56
localhostForwarding=true
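For anyone who prefers to generate the file rather than type it, here is a rough sketch that renders the same [wsl2] section with Python's standard configparser. The function name render_wslconfig and its default values are only illustrative; the defaults mirror the example above and should be sized to your own machine. WSL typically has to be restarted (for example with wsl --shutdown from Windows) before new settings take effect.

```python
import configparser
import io


def render_wslconfig(memory="480GB", swap="32GB", processors=56):
    """Render a .wslconfig [wsl2] section as text.

    The defaults are the values quoted in this thread, not recommendations.
    """
    parser = configparser.ConfigParser()
    # configparser lowercases keys by default; keep "localhostForwarding" as written.
    parser.optionxform = str
    parser["wsl2"] = {
        "memory": memory,
        "swap": swap,
        "processors": str(processors),
        "localhostForwarding": "true",
    }
    buf = io.StringIO()
    # Write "key=value" without spaces around "=", matching the example above.
    parser.write(buf, space_around_delimiters=False)
    return buf.getvalue()


# To save it (run with Windows Python; the path is an assumption):
# from pathlib import Path
# (Path.home() / ".wslconfig").write_text(render_wslconfig(memory="48GB", processors=16))
```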

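========================
Note: WSL appears to support only 64 threads, and DeepSpeed does not support hyper-threading, so when using a W9-3495X you need to disable hyper-threading in the BIOS.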
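To confirm that the new limits are actually visible inside WSL, a quick check along these lines may help. It is a sketch that assumes a Linux /proc/meminfo is present (as it is inside a WSL 2 distribution); the helper names mem_total_gib and read_limits are hypothetical.

```python
import os


def mem_total_gib(meminfo_text):
    """Extract MemTotal from /proc/meminfo-style text and convert KiB to GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])  # second field is the size in kB
            return kib / (1024 ** 2)
    raise ValueError("MemTotal not found")


def read_limits(path="/proc/meminfo"):
    """Return (total memory in GiB, logical CPU count) as seen by this process."""
    with open(path) as f:
        return mem_total_gib(f.read()), os.cpu_count()


# Inside WSL:
# mem, cpus = read_limits()
# print(f"memory visible: {mem:.1f} GiB, logical CPUs: {cpus}")
```

If the reported memory is far below what .wslconfig requests, or the CPU count differs from the processors setting, the file was probably not picked up or WSL was not restarted.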

Jun 16 '23 01:06 lhtpluto