Results: 13 comments of bene-ges

@traidn, maybe this is not an official way, but you can try

```python
state_dict = torch.load("model_weights.ckpt", map_location=device)
asr_model.load_state_dict(state_dict)
```

you can get `model_weights.ckpt` if you unpack the .nemo checkpoint with...
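For completeness, a minimal sketch of the whole flow, assuming the .nemo archive keeps `model_weights.ckpt` at its top level and that `asr_model` is already constructed (the file and directory names are placeholders):

```python
import tarfile
import torch

# A .nemo checkpoint is a tar archive; unpack it to get model_weights.ckpt.
with tarfile.open("model.nemo") as tar:  # placeholder file name
    tar.extractall(path="unpacked_nemo")

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
state_dict = torch.load("unpacked_nemo/model_weights.ckpt", map_location=device)
asr_model.load_state_dict(state_dict)  # asr_model built beforehand from its config
```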

@traidn - maybe you can try loading the weights and then training the way you did from scratch, without resuming from a checkpoint?

Hi, we had the same error after successfully building fast_rnnt for AMD with ROCm 5.4 and correctly installed PyTorch 2.0.1 and torchaudio 0.15.2:

```
File "/home/ubnt/anaconda3/lib/python3.8/site-packages/fast_rnnt/rnnt_loss.py", line 533, in...
```

It seems that ROCm isn't supported in the build: `-- No NVCC detected. Disable CUDA support`
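As a quick sanity check that the ROCm stack itself is visible to PyTorch, a minimal sketch (on ROCm builds the CUDA API surface is backed by HIP, so `torch.version.hip` is set while `torch.version.cuda` is None):

```python
import torch

# On a ROCm build, torch.cuda.* is backed by HIP: is_available() is True
# on AMD GPUs, torch.version.hip is set, and torch.version.cuda is None.
print("GPU available:", torch.cuda.is_available())
print("HIP runtime:  ", torch.version.hip)
print("CUDA runtime: ", torch.version.cuda)
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```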

@danpovey, ROCm can compile CUDA code into an AMD binary. Most projects just add the ROCm compile commands, like PyTorch does. So the PyTorch build system can be an...
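In case it helps, a minimal sketch of hipifying the sources the way PyTorch extensions do; the keyword names (`project_directory`, `output_directory`, `includes`, `is_pytorch_extension`) are assumptions based on recent PyTorch releases, and the source path is hypothetical:

```python
from torch.utils.hipify import hipify_python

# Translate CUDA sources to HIP in-tree, the way PyTorch's own build does.
hipify_python.hipify(
    project_directory="fast_rnnt",
    output_directory="fast_rnnt",   # write hipified files next to the originals
    includes=["fast_rnnt/csrc/*"],  # hypothetical location of the .cu/.cuh sources
    is_pytorch_extension=True,
    show_detailed=True,
)
```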

Another useful link on porting CUDA (the notations are almost all identical): https://www.lumi-supercomputer.eu/preparing-codes-for-lumi-converting-cuda-applications-to-hip/

I can help with testing on AMD if needed.

@chaoyanghe, would you merge this PR? It's a simple bug fix.

I see that this issue (https://github.com/mlc-ai/mlc-llm/issues/1344) says RoPE is supported. Maybe there is a way to turn it on? I didn't find it in the options.

Tried building llama.cpp with Vulkan: on the same 5x AMD Radeon Instinct MI50 setup, the same inference command works fine but 2x slower, and VRAM doesn't grow during prompt reading.