WeiMa01

4 issues by WeiMa01

When I launch the script with DeepSpeed, I get the error "error: use of undeclared identifier __double2half; did you mean __double2hiint?" ## 1. The script is as follows: ` import torch...

bug
inference

Hello, I have a question that I hope can be answered. Why are only Dense-only (0% sparsity) quantized models provided for LLaMA-2-7B and Mistral, but not 0.05% sparsity and 0.45% sparsity...

Which source path is used for the BF16 GEMM kernel (vbfdotq_f32)? 1. System clang path: /usr/lib/llvm-14/lib/clang/14.0.0/include/arm_neon.h 2. Android NDK clang path: /android-ndk-r27c/toolchains/llvm/prebuilt/linux-x86_64/lib/clang/18/include/arm_neon.h

We are trying to add 3 parameters to the 6x16-aarch64-neonfp16arith-cortex-a75.S.in script so that the f16-gemm-6x16-minmax-asm-aarch64-neonfp16arith-cortex-a75.S kernel can be modified. The 3 parameters are: - size_t index, - size_t tile, - void*...