Barry Kang
Sorry for the late reply, could you please try later versions? @fedem96 @ChristianPala This bug is fixed in later versions. Thanks!
@YihengBrianWu could you please provide more details about the model you used? This would help us reproduce the issue, thanks.
Thanks for the information. Let me try to reproduce and investigate this.
Hi @YihengBrianWu, could you please try the tests on the latest main branch? I ran them successfully on V100s. Thanks!
I tried the latest main with pynvml == 11.4.0/11.4.1/11.5.0, and all of these versions work fine for the conversion on T4. Could you please try a clean build and share more details...
Hi @ChristianPala @gloritygithub11, `int4_awq` for MoE is not implemented yet. We are working on it and will update the status here once it's ready. Thanks for your...
Hi @gloritygithub11, we are still working on this. Thanks for your patience!
Yes, it's still on the roadmap. We are working on this.
@ttim you can use INT4 weights & FP8 activations with the following steps: 1. Update your local repo to the latest main and build with `--cuda_architectures "90-real"`. 2. Follow [INT4...
@felixslu This error appears when the names of the generated checkpoints are not converted correctly. This is fixed in the latest main branch.