WeiMa01 issues

Results: 4 issues by WeiMa01

[BUG] error: use of undeclared identifier '__double2half'; did you mean '__double2hiint'?

When I launch the script with DeepSpeed, I get an error: "error: use of undeclared identifier '__double2half'; did you mean '__double2hiint'?" 1. The script is as follows: ` import torch...

bug

inference

A question about why LLaMA-2-7B and Mistral models only provide Dense-only (0%) quantized models

Hello, I have a question that I hope can be answered. Why do the LLaMA-2-7B and Mistral models only provide Dense-only (0%) quantized models, but not 0.05% sparsity and 0.45% sparsity...

which clang source of bf16 GEMM kernel

Which source path is used for the BF16 GEMM kernel (vbfdotq_f32)? 1. System clang path: /usr/lib/llvm-14/lib/clang/14.0.0/include/arm_neon.h 2. Android NDK clang path: /android-ndk-r27c/toolchains/llvm/prebuilt/linux-x86_64/lib/clang/18/include/arm_neon.h

[Request] Can't add new parameters to the kernel generated by 6x16-aarch64-neonfp16arith-cortex-a75.S.in

We are trying to add 3 parameters to the 6x16-aarch64-neonfp16arith-cortex-a75.S.in template so that the generated f16-gemm-6x16-minmax-asm-aarch64-neonfp16arith-cortex-a75.S kernel can be modified. The 3 parameters are: - size_t index, - size_t tile, - void*...
