justdoit
Results: 13 comments by justdoit
May I ask whether the build_tree function has a Triton implementation?
I am using the latest code for offline testing. At the beginning of the cycle, TPS = 180 tokens/s, but after a few cycles there is a serious decrease in...
How can I produce GPTQ int8 weights for V3/R1? Should I use the function per_token_group_quant_int8 to produce blockwise int8 weights?