Results: 13 comments of bene-ges

@traidn, maybe this is not an official way, but you can try

```python
state_dict = torch.load("model_weights.ckpt", map_location=device)
asr_model.load_state_dict(state_dict)
```

you can get `model_weights.ckpt` if you unpack the .nemo checkpoint with...
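For completeness, a minimal sketch of the whole flow, assuming the .nemo archive keeps `model_weights.ckpt` at its top level and that `asr_model` is already constructed (the file and directory names are placeholders):

```python
import tarfile
import torch

# A .nemo checkpoint is a tar archive; unpack it to get model_weights.ckpt.
with tarfile.open("model.nemo") as tar:  # placeholder file name
    tar.extractall(path="unpacked_nemo")

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
state_dict = torch.load("unpacked_nemo/model_weights.ckpt", map_location=device)
asr_model.load_state_dict(state_dict)  # asr_model built beforehand from its config
```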

@traidn - maybe you can try loading the weights and then training the way you did from scratch, without resuming from a checkpoint?

Hi, we had the same error after successfully building fast_rnnt for AMD with ROCm 5.4 and correctly installed PyTorch 2.0.1 and torchaudio 0.15.2:

```
File "/home/ubnt/anaconda3/lib/python3.8/site-packages/fast_rnnt/rnnt_loss.py", line 533, in...
```

It seems that ROCm isn't supported in the build: `-- No NVCC detected. Disable CUDA support`
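As a quick sanity check that the ROCm stack itself is visible to PyTorch, a minimal sketch (on ROCm builds the CUDA API surface is backed by HIP, so `torch.version.hip` is set while `torch.version.cuda` is None):

```python
import torch

# On a ROCm build, torch.cuda.* is backed by HIP: is_available() is True
# on AMD GPUs, torch.version.hip is set, and torch.version.cuda is None.
print("GPU available:", torch.cuda.is_available())
print("HIP runtime:  ", torch.version.hip)
print("CUDA runtime: ", torch.version.cuda)
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```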

@danpovey, ROCm can compile CUDA code into an AMD binary. Most projects just add the ROCm compile commands, like PyTorch does. So the PyTorch build system can be an...
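In case it helps, a minimal sketch of hipifying the sources the way PyTorch extensions do; the keyword names (`project_directory`, `output_directory`, `includes`, `is_pytorch_extension`) are assumptions based on recent PyTorch releases, and the source path is hypothetical:

```python
from torch.utils.hipify import hipify_python

# Translate CUDA sources to HIP in-tree, the way PyTorch's own build does.
hipify_python.hipify(
    project_directory="fast_rnnt",
    output_directory="fast_rnnt",   # write hipified files next to the originals
    includes=["fast_rnnt/csrc/*"],  # hypothetical location of the .cu/.cuh sources
    is_pytorch_extension=True,
    show_detailed=True,
)
```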

Another useful link on porting CUDA (the notations are almost all identical): https://www.lumi-supercomputer.eu/preparing-codes-for-lumi-converting-cuda-applications-to-hip/

I can help with testing on AMD if needed.

@chaoyanghe, would you merge this PR? It's a simple bug fix.

I see that this issue (https://github.com/mlc-ai/mlc-llm/issues/1344) says RoPE is supported. Maybe there is a way to turn it on? I didn't find it in the options.

Tried building llama.cpp with Vulkan: on the same 5x AMD Radeon Instinct MI50 setup, the same inference command works fine but 2x slower, and VRAM doesn't grow during prompt reading.