
feat: implement quantized_matmul with a typed CPU implementation that dispatches on precision

Open Elubrazione opened this issue 3 months ago • 0 comments

  • Add complete quantized_matmul_impl_typed template function for CPU (float16, float32, and bfloat16).
  • Add fp32 test cases for quantized_matmul.
  • Relax float32 tolerance in test utils.
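
For readers unfamiliar with the pattern, here is a minimal sketch of what a typed CPU kernel plus a precision dispatch could look like. It is not the code from this change. It assumes 4-bit weights packed eight per `uint32_t` (lowest nibble first), one scale and bias per group, and an output of `a @ dequant(w).T`. The `DType` enum and `quantized_matmul_dispatch` are hypothetical names, and `float`/`double` stand in for float16/bfloat16, which standard C++ does not provide portably.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical dtype tag, used only in this sketch.
enum class DType { Float32, Float64 };

// a: [M, N], w: [K, N / 8] packed 4-bit values, scales/biases: [K, N / group_size],
// out: [M, K]. Assumed dequantization: w_real = q * scale + bias.
template <typename T>
void quantized_matmul_impl_typed(const T* a, const uint32_t* w, const T* scales,
                                 const T* biases, T* out, int M, int N, int K,
                                 int group_size) {
    constexpr int bits = 4;
    constexpr int per_word = 32 / bits;            // 8 values per uint32_t
    constexpr uint32_t mask = (1u << bits) - 1u;   // 0xF
    if (group_size <= 0 || N % group_size != 0 || N % per_word != 0) {
        throw std::invalid_argument("N must be a multiple of group_size and 8");
    }
    const int groups_per_row = N / group_size;
    const int words_per_row = N / per_word;

    for (int i = 0; i < M; ++i) {
        for (int k = 0; k < K; ++k) {
            float acc = 0.0f;  // accumulate in float regardless of T
            for (int g = 0; g < groups_per_row; ++g) {
                const float scale = static_cast<float>(scales[k * groups_per_row + g]);
                const float bias = static_cast<float>(biases[k * groups_per_row + g]);
                for (int j = g * group_size; j < (g + 1) * group_size; ++j) {
                    const uint32_t word = w[k * words_per_row + j / per_word];
                    const uint32_t q = (word >> ((j % per_word) * bits)) & mask;
                    acc += static_cast<float>(a[i * N + j]) *
                           (static_cast<float>(q) * scale + bias);
                }
            }
            out[i * K + k] = static_cast<T>(acc);
        }
    }
}

// Hypothetical dispatcher: picks the template instantiation from a runtime dtype.
void quantized_matmul_dispatch(DType dtype, const void* a, const uint32_t* w,
                               const void* scales, const void* biases, void* out,
                               int M, int N, int K, int group_size) {
    switch (dtype) {
        case DType::Float32:
            quantized_matmul_impl_typed<float>(
                static_cast<const float*>(a), w, static_cast<const float*>(scales),
                static_cast<const float*>(biases), static_cast<float*>(out),
                M, N, K, group_size);
            return;
        case DType::Float64:
            quantized_matmul_impl_typed<double>(
                static_cast<const double*>(a), w, static_cast<const double*>(scales),
                static_cast<const double*>(biases), static_cast<double*>(out),
                M, N, K, group_size);
            return;
    }
    throw std::invalid_argument("unsupported dtype");
}
```

Accumulating in `float` and casting once at the end is a common choice for half-precision inputs, and it is one reason a float32 test may need a looser tolerance than an exact comparison.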

Elubrazione, Oct 18 '25 12:10