Sergey 'Jin' Bostandzhyan

Results 94 comments of Sergey 'Jin' Bostandzhyan

> Delete xformers word in this file requirements.txt save file and run pip install -r requirements.txt again

The issue is that if `xformers` is not installed at all, there will...

Workaround for ROCm users: apparently `xformers` needs to be installed even if it is not being used, due to the hard import from the trace above. I had to...

I can't tell if this is related to your error, but keep in mind that `--device` does not work, it's hardcoded to `cuda:3` in the sources: https://github.com/gaomingqi/Track-Anything/blob/e6e159273790974e04eeea6673f1f93c035005fc/app.py#L381C16-L381C21
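
For illustration only, here is a minimal sketch of how a `--device` option can be passed through instead of a hardcoded value. This is not the Track-Anything code; the function name `parse_device` and the default `cuda:0` are assumptions made up for this example.

```python
import argparse


def parse_device(argv=None):
    """Return the device string given on the command line.

    Hypothetical helper: the name and the default value are assumptions
    for this sketch and are not taken from the Track-Anything sources.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--device",
        default="cuda:0",  # assumed default, instead of a hardcoded "cuda:3"
        help="device string, e.g. cuda:0 or cpu",
    )
    args = parser.parse_args(argv)
    return args.device


if __name__ == "__main__":
    # The returned string would then be handed to whatever code selects
    # the device, rather than a literal written into the source.
    print(parse_device())
```

With something like this, the value given via `--device` is what reaches the model setup, and the default only applies when the flag is omitted.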

I also see a 100% CPU hog and a hang; the backtraces always lead via `roc::DmaBlitManager::hsaCopyStaged`, ending up in `rocr::core::InterruptSignal::WaitRelaxed`. I can reproduce it with the ROCm Validation Suite with...

I updated to kernel `6.8.0-0.rc4.20240215git8d3dea210042.38.fc41.x86_64` and now the `rvs` test fails with a coredump:
```
Running kernels 5000 times
Precision: double
Array size: 268.4 MB (=0.3 GB)
Total size: 805.3...
```

I kept poking at this, tried out the examples from the HIP-examples repo and basically I can see that any code that is taking the path via `roc::DmaBlitManager::hsaCopyStaged` ends up...

I installed Fedora 39 in a docker container (host is running Rawhide/Fedora 40) and checked with ROCm 5.7.1; the problem is **not** present there, both the validation suite as well...

@ppanchad-amd OK, but it will take a while since I will have to compile ROCm 6.1.1 myself; back then 5.7.1 and 6.0.0 were packaged in Fedora 39 and Rawhide pre-40,...

@ppanchad-amd I was lucky: ROCm 6.1.1 has recently made it into Rawhide 41, so I did not have to recompile anything myself and was able to retest with a docker image....

@taweili I think the issue might be exactly what I have been trying to debug here: https://github.com/ROCm/ROCm/issues/2715. If you need a solution "now", try downgrading to ROCm 5.7.1, I was...