Results: 2 issues of eliotwang
## Proposed changes

Added an example of a bf16 * fp4 GEMM, where the fp4 values and the fp4 scales are both stored in uint8 format. In the pipeline, matrix B (fp4) is dequantized to bf16 before performing...
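The dequantize-then-multiply flow described above can be sketched in plain Python. This is an illustration under stated assumptions, not the kernel from the pull request: it assumes the E2M1 encoding for fp4, two fp4 codes packed per uint8 with the low nibble first, and an E8M0 (power-of-two) scale shared by each block of `block` elements. The function names are hypothetical, and Python floats stand in for bf16.

```python
# Magnitudes of the eight E2M1 codes (low 3 bits); bit 3 is the sign.
FP4_E2M1 = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def decode_fp4(code):
    """Decode one 4-bit E2M1 code into a float."""
    mag = FP4_E2M1[code & 0x7]
    return -mag if code & 0x8 else mag


def unpack_fp4(byte):
    """Split one uint8 into two fp4 values.

    Assumption: the low nibble holds the first element.
    """
    return decode_fp4(byte & 0xF), decode_fp4(byte >> 4)


def decode_scale(byte):
    """Assumption: the uint8 scale is E8M0, i.e. 2 ** (byte - 127)."""
    return 2.0 ** (byte - 127)


def dequantize_row(packed, scales, block=32):
    """Expand K/2 packed bytes into K floats and apply per-block scales."""
    values = []
    for b in packed:
        values.extend(unpack_fp4(b))
    return [v * decode_scale(scales[i // block]) for i, v in enumerate(values)]


def gemm_a_bt(a, b_packed, b_scales, block=32):
    """Compute A @ B^T, dequantizing each packed row of B first.

    a:        M rows of K floats (stand-in for bf16)
    b_packed: N rows of K/2 uint8 values
    b_scales: N rows of K/block uint8 scales
    """
    b = [dequantize_row(r, s, block) for r, s in zip(b_packed, b_scales)]
    return [[sum(x * y for x, y in zip(ar, br)) for br in b] for ar in a]
```

For example, with `block=2`, the packed row `[0x21, 0xA4]` and scales `[127, 128]` dequantize to `[0.5, 1.0, 4.0, -2.0]`. A real kernel would fuse this step into the GEMM pipeline instead of materializing B.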
## Proposed changes

Added an example of a bf16 * FP4 GEMM with bias and SwiGLU activation. In this implementation, both the FP4 weights and the FP4 scaling factors are stored in uint8...
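The bias and SwiGLU step applied to a GEMM output row might look like the following. This is a sketch under assumptions, not the pull request's epilogue: it assumes each output row stores the gate half first and the up half second, and that the bias is added before the activation. `bias_swiglu` is a hypothetical name.

```python
import math


def silu(x):
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def bias_swiglu(row, bias):
    """Add a bias to one GEMM output row, then apply SwiGLU.

    Assumption: the first half of the row is the gate projection and
    the second half is the up projection, so the output is half as wide.
    """
    n = len(row) // 2
    y = [v + b for v, b in zip(row, bias)]
    return [silu(g) * u for g, u in zip(y[:n], y[n:])]
```

Other implementations interleave the gate and up columns rather than splitting the row in half. Only the indexing in `bias_swiglu` would change in that case.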