
Question about torch.distributed.elastic.multiprocessing.errors

Open AnjouPry opened this issue 1 year ago • 3 comments

image

AnjouPry avatar Mar 20 '24 07:03 AnjouPry

I have the same problem

fenglincong avatar Mar 21 '24 11:03 fenglincong

This can happen when the CUDA version on your system does not match the CUDA version that your installed PyTorch build supports. Please run nvcc --version and python -c "import torch; print(torch.__version__); print(torch.version.cuda)" and check whether the two CUDA versions match. If they already match, try adding the CUDA libraries to LD_LIBRARY_PATH with export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH. Otherwise, reinstall PyTorch with conda install pytorch torchvision torchaudio pytorch-cuda=YOUR_CUDA_VERSION_HERE -c pytorch -c nvidia.
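
The version check described above could be scripted roughly as follows. This is only a sketch and was not part of the original reply: the helper names parse_nvcc_release and same_major_minor are made up for illustration, and comparing only the major.minor components is an assumption about what counts as a "match". It reads torch.version.cuda and the release string printed by nvcc --version, then reports whether they agree.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Return the 'X.Y' release string from `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def same_major_minor(a, b):
    """Compare two version strings on their major.minor components only.

    Assumption: a major.minor match is treated as 'the versions match'.
    """
    return a.split(".")[:2] == b.split(".")[:2]


def main():
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
        return

    # torch.version.cuda is None for CPU-only PyTorch builds.
    built_for = torch.version.cuda
    print("torch", torch.__version__, "built for CUDA", built_for)

    if shutil.which("nvcc") is None:
        print("nvcc not found on PATH")
        return

    result = subprocess.run(["nvcc", "--version"], capture_output=True, text=True)
    toolkit = parse_nvcc_release(result.stdout)
    print("nvcc reports CUDA", toolkit)

    if built_for and toolkit:
        print("match" if same_major_minor(built_for, toolkit) else "MISMATCH")


if __name__ == "__main__":
    main()
```

If the script reports a mismatch, the reinstall command above applies; if it reports a match, the LD_LIBRARY_PATH suggestion is the next thing to try.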

JThh avatar Mar 22 '24 11:03 JThh

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] avatar Mar 30 '24 01:03 github-actions[bot]

This issue was closed because it has been inactive for 7 days since being marked as stale.

github-actions[bot] avatar May 02 '24 01:05 github-actions[bot]