Sushil Singh comments


Trax ML: GPU memory allocated but completed on CPU

Same problem for me. Please post if you find a solution.

Trax ML: GPU memory allocated but completed on CPU

Setting "trax.fastmath.set_backend('tensorflow-numpy')" seems to help; I can now see GPU cycles being used.
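For anyone who wants to try the same workaround, here is a minimal sketch. The helper name `select_tf_numpy_backend` is made up for illustration, and it assumes both trax and tensorflow are installed; it only switches the backend and lists the GPUs TensorFlow can see, it does not prove the training step itself runs on the GPU.

```python
def select_tf_numpy_backend():
    """Switch Trax to the TensorFlow-NumPy backend and list visible GPUs.

    Hypothetical helper for illustration. Returns None if trax or
    tensorflow cannot be imported in this environment.
    """
    try:
        import tensorflow as tf
        import trax
    except Exception:  # missing package or an incompatible install
        return None

    # The workaround from the comment above: use 'tensorflow-numpy' as the backend.
    trax.fastmath.set_backend('tensorflow-numpy')

    # Names of the GPU devices TensorFlow can see (empty list if none).
    return [device.name for device in tf.config.list_physical_devices('GPU')]


if __name__ == "__main__":
    print("Visible GPUs:", select_tf_numpy_backend())
```

Call it before building the model, then watch nvidia-smi during training to confirm that GPU utilization actually goes up.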

Nvidia cuda samples fail to run with SCUDA due to lack of cuda-elf parsing.

It looks like the kernel function was not parsed properly when __cudaRegisterFatBinary was called, so the client code failed on the kernel launch call.

Nvidia cuda samples fail to run with SCUDA due to lack of cuda-elf parsing.

I debugged it further. The problem appears to be in __cudaRegisterFatBinary: the CUDA samples from NVIDIA are stored as ELF by default and require further processing to extract the CUDA kernel details...
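To illustrate the first step of that extra processing, here is a rough sketch that checks whether a blob is a little-endian ELF64 image and reads the header fields needed to locate its section table. This is not SCUDA code: the function name `parse_elf64_header` and the returned dictionary keys are made up for illustration, and it assumes the ELF image has already been pulled out of the fat binary.

```python
import struct

ELF_MAGIC = b"\x7fELF"
EM_CUDA = 190  # e_machine value assigned to NVIDIA CUDA in elf.h
# Little-endian ELF64 file header layout (64 bytes, no padding).
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


def parse_elf64_header(blob):
    """Return key ELF64 header fields, or None if blob is not little-endian ELF64."""
    if len(blob) < ELF64_HEADER.size or blob[:4] != ELF_MAGIC:
        return None
    if blob[4] != 2 or blob[5] != 1:  # EI_CLASS must be ELFCLASS64, EI_DATA must be LSB
        return None

    (_ident, _e_type, e_machine, _e_version, _e_entry, _e_phoff, e_shoff,
     _e_flags, _e_ehsize, _e_phentsize, _e_phnum, e_shentsize, e_shnum,
     e_shstrndx) = ELF64_HEADER.unpack_from(blob)

    return {
        "is_cuda": e_machine == EM_CUDA,
        "section_table_offset": e_shoff,
        "section_entry_size": e_shentsize,
        "section_count": e_shnum,
        "section_names_index": e_shstrndx,
    }
```

With the section table offset and the section-name string table index in hand, a parser can walk the section headers to find the per-kernel sections, which is the part that still needs to be implemented.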

Nvidia cuda samples fail to run with SCUDA due to lack of cuda-elf parsing.

[vectorAdd.build_with_keep.tar.gz](https://github.com/user-attachments/files/19277085/vectorAdd.build_with_keep.tar.gz) These are the artifacts generated when building with "--keep".