ZiWei Yuan

Results: 50 comments by ZiWei Yuan

Why is the ch1 I see now already the fully implemented code?

> /root/ktransformers/csrc/balance_serve/build

Could you enter the balance_serve directory, build there, and share the output? You can `cd` into it, run `rm -rf build`, then `cmake -B build`, and finally...

> Hello, I have the same problem and the installation failed. What can I do?

@qiyuxinlin It seems the version of flash_infer being used is wrong (a mismatch)?

You are welcome to report new models that are supported in kt_kernel. Please include a snapshot as proof.

Do you have AVX512 support? Our kernel appears to target AMX/AVX512, so you can run `lscpu` and check whether AVX512 is available. We are also writing support for more general...
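If you would rather script the `lscpu` check, here is a minimal sketch. It assumes a Linux host and the kernel's usual flag names (`avx2`, `avx512f`, `amx_tile`). The helper name `detect_cpu_features` is hypothetical and is not part of ktransformers or kt_kernel.

```python
from pathlib import Path

# Flag names as the Linux kernel reports them in /proc/cpuinfo and lscpu.
# Assumption: avx512f (the AVX-512 foundation subset) stands in for "AVX512",
# and amx_tile stands in for "AMX".
FEATURE_FLAGS = {
    "AVX2": "avx2",
    "AVX512": "avx512f",
    "AMX": "amx_tile",
}


def detect_cpu_features(cpuinfo_text: str) -> dict[str, bool]:
    """Report which instruction sets appear in the 'flags' lines of the text.

    Accepts the contents of /proc/cpuinfo or the output of `lscpu`.
    """
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # /proc/cpuinfo uses "flags", lscpu uses "Flags".
        if key.strip().lower() == "flags":
            flags.update(value.split())
    return {name: flag in flags for name, flag in FEATURE_FLAGS.items()}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print(detect_cpu_features(cpuinfo.read_text()))
```

Running it on the affected machine and pasting the printed dictionary into the issue gives the same information as the `lscpu` output, just in a shorter form.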

> > Do you have avx512 support? Seems our kernel is for amx/avx512. So you may use `lscpu` and check if avx512 is available. But we are writing support for...

I am working on fixing this. I have tested the following sets:

```
export CPUINFER_ENABLE_AMX=OFF
export CPUINFER_CPU_INSTRUCT=AVX512
```

```
export CPUINFER_CPU_INSTRUCT=AVX2
export CPUINFER_ENABLE_AMX=OFF
```

```
export CPUINFER_ENABLE_AMX=ON
```

`export`...
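For anyone reproducing these runs, here is a rough sketch of how one of the tested sets might be picked from the CPU's capabilities. Only the variable names and values (`CPUINFER_ENABLE_AMX`, `CPUINFER_CPU_INSTRUCT`, `AVX512`, `AVX2`) come from the sets above. The mapping from capability to set is an assumption, and `suggest_cpuinfer_env` and `as_exports` are hypothetical helpers, not part of the build system.

```python
def suggest_cpuinfer_env(has_amx: bool, has_avx512: bool, has_avx2: bool) -> dict[str, str]:
    """Pick one of the tested variable sets, preferring the widest instruction set.

    Assumption: AMX is preferred over AVX512, and AVX512 over AVX2.
    """
    if has_amx:
        return {"CPUINFER_ENABLE_AMX": "ON"}
    if has_avx512:
        return {"CPUINFER_ENABLE_AMX": "OFF", "CPUINFER_CPU_INSTRUCT": "AVX512"}
    if has_avx2:
        return {"CPUINFER_ENABLE_AMX": "OFF", "CPUINFER_CPU_INSTRUCT": "AVX2"}
    raise ValueError("none of AMX, AVX512, or AVX2 is available")


def as_exports(env: dict[str, str]) -> str:
    """Render the chosen set as shell `export` lines."""
    return "\n".join(f"export {name}={value}" for name, value in env.items())


if __name__ == "__main__":
    # Example: a CPU with AVX512 but no AMX.
    print(as_exports(suggest_cpuinfer_env(has_amx=False, has_avx512=True, has_avx2=True)))
```

Keeping the choice in one function makes it easy to report exactly which set was used when a build fails.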

Please add some description to this PR.

> https://docs.vllm.ai/en/v0.8.3/getting_started/installation/cpu.html
>
> > Supported features
> >
> > vLLM CPU backend supports the following vLLM features:
> >
> > Tensor Parallel
> >
> > Model Quantization (INT8 W8A8, AWQ, GPTQ)
>
> ...