Wuqiman
Is multi-GPU training supported? I tried it, but it seems only single-GPU training works.
> With the setting CUDA_VISIBLE_DEVICES=0, 1, 2, 3 python train.py, only GPU 0's memory is occupied. What is the problem here? @YunYang1994
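Note that CUDA_VISIBLE_DEVICES only controls which GPUs the process can see; the training script itself still has to distribute the work across them. A minimal sketch, assuming a TensorFlow 2 / tf.keras setup (the names `build_model` and the compile settings are placeholders, not taken from this repo):

```python
import os

# Expose GPUs 0-3 to the process (note: no spaces after the commas).
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3"

import tensorflow as tf

# Setting CUDA_VISIBLE_DEVICES alone does not split the training; the script
# must also use a distribution strategy. MirroredStrategy replicates the model
# on every visible GPU and averages the gradients across replicas.
strategy = tf.distribute.MirroredStrategy()
print("Number of replicas:", strategy.num_replicas_in_sync)

with strategy.scope():
    model = build_model()  # placeholder for however train.py builds its model
    model.compile(optimizer="adam", loss="mse")

# model.fit(...) now runs data-parallel across GPUs 0-3.
```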
Hello~ I want to add a quantization-bits parameter to Conv. What should I do? Thank you very much!
> Can you elaborate a little more on the scenario, please? What is the "quantization bits" used for? Currently, there is no specifically designed place to store such a parameter in ONNX...
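If the goal is simply to carry the bit width alongside a Conv node, one workaround (not an official ONNX mechanism) is to attach it as a custom node attribute or as model-level metadata. A sketch using the `onnx` Python package; `quant_bits` is an invented attribute name, not part of the Conv spec:

```python
import onnx
from onnx import helper, TensorProto

# A Conv node with an extra, non-standard attribute carrying the bit width.
# Standard ONNX consumers will ignore (or reject) unknown attributes, so this
# only works with tooling that knows to look for it.
conv = helper.make_node(
    "Conv",
    inputs=["x", "w"],
    outputs=["y"],
    kernel_shape=[3, 3],
    quant_bits=8,  # invented attribute, consumed only by custom tooling
)

graph = helper.make_graph(
    [conv],
    "conv_with_quant_bits",
    inputs=[
        helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 3, 224, 224]),
        helper.make_tensor_value_info("w", TensorProto.FLOAT, [16, 3, 3, 3]),
    ],
    outputs=[helper.make_tensor_value_info("y", TensorProto.FLOAT, None)],
)
model = helper.make_model(graph)

# Alternative: keep the node standard and store the value as model metadata.
model.metadata_props.add(key="conv_quant_bits", value="8")
onnx.save(model, "conv_quant_bits.onnx")
```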
For MHA precision, we want the MatMul layers cast to FP32 (they currently run in FP16). However, it doesn't work.
Using our FP32 LayerNorm plugin, the model's accuracy is normal. Using the FP16 Myelin path, accuracy drops by more than 20%.
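One way to force individual layers back to FP32 in an otherwise FP16 TensorRT engine is to pin the layer precision and also tell the builder to obey precision constraints; without that flag the per-layer setting can be silently ignored, which matches the "it doesn't work" symptom. A sketch assuming the TensorRT 8.x Python API and a network already parsed into `network`:

```python
import tensorrt as trt

def force_matmul_fp32(network: trt.INetworkDefinition, config: trt.IBuilderConfig):
    """Keep the engine in FP16 overall, but pin every MatMul layer to FP32."""
    config.set_flag(trt.BuilderFlag.FP16)
    # Without this flag the builder may drop per-layer precision requests.
    config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)

    for i in range(network.num_layers):
        layer = network.get_layer(i)
        if layer.type == trt.LayerType.MATRIX_MULTIPLY:
            layer.precision = trt.float32
            for j in range(layer.num_outputs):
                layer.set_output_type(j, trt.float32)
```

If the accuracy is still off, it may be because the layers inside a Myelin-fused MHA region are not individually controllable, in which case a plugin (as with the LayerNorm case above) may remain the practical fallback.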