Sekri0

Results 8 comments of Sekri0

> We are developing a complete link from pseudo-quantized models to real packing weights and directly executing WxAy quantized inference in Torch, which is expected to be released within a...

Thanks for the reply. I have one more question: in the end-to-end experiment, which kernel is used in the prefill phase of the w2a8 model?

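I have added detailed environment information. The FunASR 1.2.0 I currently have installed already appears to be the latest version, and the error still occurs.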

After installing the latest funasr 1.2.1 from source, the error went away. Thank you very much.

@zachzzc @raywanb Sorry to bother you guys, could you please take a look at this problem?

> Can you provide a minimum script to reproduce your problem? @Sekri0

Sorry for not replying in time. This issue occurs midway through the inference service, so I'm not...

> > > Can you provide a minimum script to reproduce your problem? @Sekri0
> >
> > Sorry for not replying in time, this issue occurs...

> [@liweiqing1997](https://github.com/liweiqing1997) Totally understand. We will try to quant and fix this by next week. The bug is most likely in vLLM changing model parameter names, based on your stack...