pnunna93
@chauhang Could you try with ROCm 6.0? You can use this Docker image, rocm/pytorch:rocm6.0.2_ubuntu22.04_py3.10_pytorch_2.1.2, and install bitsandbytes directly.
@chauhang, in that case you can skip the hipBLASLt update and install bitsandbytes directly. Please let me know if you face any issues.
Yes, it's updated for ROCm 6.
Hi @friedc, the issue is related to the rocPRIM/hipCUB library and should have been fixed in ROCm 6.1 (https://github.com/ROCm/rocPRIM/issues/452). Please try upgrading the libraries or use one of our latest dockers...
Hi @DavideRossi, could you please share Python and HIP traces for the script? For the Python trace, you can add this before the line where the script hangs. You can stop it...
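For the Python side, one standard-library option (an assumption here, not necessarily the snippet meant above) is `faulthandler`, which can dump the stack of every thread if the script is still running after a timeout. A minimal sketch; the helper names `arm_hang_trace` and `disarm_hang_trace` and the 60-second default are hypothetical and chosen only for illustration:

```python
import faulthandler
import sys


def arm_hang_trace(timeout_s=60.0, file=sys.stderr):
    """Schedule a traceback dump of all threads after `timeout_s` seconds.

    Hypothetical helper: call it just before the line suspected of hanging.
    `file` must be a real file object (it needs a file descriptor).
    """
    faulthandler.dump_traceback_later(timeout_s, repeat=False, file=file, exit=False)


def disarm_hang_trace():
    """Cancel the pending dump once the suspect line has returned."""
    faulthandler.cancel_dump_traceback_later()
```

Call `arm_hang_trace()` immediately before the line that hangs and `disarm_hang_trace()` right after it. If the line never returns, the traceback printed after the timeout shows where each thread is stuck.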
@Titus-von-Koeller Sure, I have been ironing out some details and got most of the info I need. I will start testing and open a PR early next week.
Hi @garrettbyrd, thanks for bringing it to our attention. There were a lot of updates in the last few weeks which made those instructions obsolete. All the branches are updated...
@garrdbyrd, could you please build with this dockerfile and check? [bnb_rocm_dockerfile.txt](https://github.com/user-attachments/files/15570506/bnb_rocm_dockerfile.txt) Your environment may have multiple libstdc++.so.* files, which could have caused the issue.
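To check for multiple libstdc++.so.* copies, a quick scan of the usual library directories can help. A minimal sketch; the function name `find_libstdcxx` and the directory list are assumptions for illustration and will likely need adjusting for your distro or conda environment:

```python
from pathlib import Path

# Hypothetical default locations; add your conda/venv lib directory as needed.
DEFAULT_DIRS = [
    "/usr/lib/x86_64-linux-gnu",
    "/usr/lib64",
    "/usr/lib",
    "/opt/conda/lib",
]


def find_libstdcxx(search_dirs=DEFAULT_DIRS):
    """Return {directory: [sorted libstdc++.so* paths]} for directories that have any."""
    found = {}
    for d in search_dirs:
        p = Path(d)
        if not p.is_dir():
            continue  # skip locations that do not exist on this machine
        matches = sorted(str(m) for m in p.glob("libstdc++.so*"))
        if matches:
            found[d] = matches
    return found


if __name__ == "__main__":
    for directory, libs in find_libstdcxx().items():
        print(directory)
        for lib in libs:
            print("   ", lib)
```

If more than one directory shows up, the copy that gets loaded at runtime depends on the loader search order (for example `LD_LIBRARY_PATH`), so an older copy may shadow the one the build expects.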