Extremely slow

Open seghier opened this issue 1 year ago • 3 comments

Extremely slow in CPU mode

seghier avatar Oct 21 '24 02:10 seghier

Could you please provide more details? Which command is extremely slow?

dawnmsg avatar Oct 21 '24 08:10 dawnmsg

Which compiler?

alexeyvolkoff avatar Oct 21 '24 16:10 alexeyvolkoff

Ubuntu 20.04 Clang-18

main: llama threadpool init, n_threads = 2

system_info: n_threads = 2 (n_threads_batch = 2) / 16 | AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | RISCV_VECT = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 |

sampler seed: 4294967295
sampler params:
    repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
    top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.000
    mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampler chain: logits -> logit-bias -> penalties -> greedy
generate: n_ctx = 2048, n_batch = 1, n_predict = 6, n_keep = 1

Daniel went back to the the the garden. Mary travelled to the kitchen. Sandra journeyed to the kitchen. Sandra went to the hallway. John went to the bedroom. Mary went back to the garden. Where is Mary? Answer: Mary is in the garden.

llama_perf_sampler_print: sampling time = 1.56 ms / 54 runs ( 0.03 ms per token, 34526.85 tokens per second)
llama_perf_context_print: load time = 1756.28 ms
llama_perf_context_print: prompt eval time = 36718.06 ms / 48 tokens ( 764.96 ms per token, 1.31 tokens per second)
llama_perf_context_print: eval time = 3840.11 ms / 5 runs ( 768.02 ms per token, 1.30 tokens per second)
llama_perf_context_print: total time = 40564.05 ms / 53 tokens
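For anyone comparing runs, the per-token figures above can be rechecked from the raw timings. The snippet below is only a rough sketch: `LOG`, `PATTERN`, and `throughput` are names made up for illustration and are not part of BitNet or llama.cpp, and the regex assumes the timing lines keep the format shown in this log.

```python
import re

# Timing lines copied from the log above (trailing parentheses omitted).
LOG = """\
llama_perf_context_print: prompt eval time = 36718.06 ms / 48 tokens
llama_perf_context_print: eval time = 3840.11 ms / 5 runs
"""

# Hypothetical pattern: assumes "<phase> time = <ms> ms / <count> tokens|runs".
PATTERN = re.compile(
    r"(prompt eval|eval) time\s*=\s*([\d.]+) ms\s*/\s*(\d+) (?:tokens|runs)"
)


def throughput(log):
    """Return {phase: (ms_per_token, tokens_per_second)} for each timing line."""
    result = {}
    for phase, ms, count in PATTERN.findall(log):
        ms, count = float(ms), int(count)
        result[phase] = (ms / count, count / (ms / 1000.0))
    return result


if __name__ == "__main__":
    for phase, (ms_per_token, tokens_per_s) in throughput(LOG).items():
        print(f"{phase}: {ms_per_token:.2f} ms/token, {tokens_per_s:.2f} tokens/s")
```

Both phases come out at roughly 765 ms per token, so prompt processing and generation are equally slow in this run.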

sunzj avatar Oct 25 '24 10:10 sunzj

Please try the latest model on Hugging Face: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-gguf

sd983527 avatar Apr 17 '25 07:04 sd983527