eliotwang issues


Results: 2 issues by eliotwang

Bf16*fp4 gemm

## Proposed changes

Added an example of bf16*fp4 GEMM, where the fp4 values and fp4_scale are stored as uint8. In the pipeline, matrix B (fp4) is dequantized to bf16 before performing...
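The dequantize-then-multiply idea in this summary can be sketched in NumPy. This is a minimal illustration, not the code from the pull request. It rests on several assumptions the summary does not confirm: fp4 is taken to be E2M1, two codes are packed per uint8 with the low nibble first, each scale byte is an E8M0 power-of-two exponent shared by a block of 32 values along K, and float32 stands in for bf16 because NumPy has no bf16 type. The names `unpack_fp4`, `decode_scale`, and `dequant_gemm` are hypothetical.

```python
import numpy as np

# Magnitudes of the 8 E2M1 codes (low 3 bits); bit 3 is the sign bit.
E2M1_MAGNITUDES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0],
                           dtype=np.float32)


def unpack_fp4(packed: np.ndarray) -> np.ndarray:
    """Unpack uint8 bytes into float32 values, two fp4 codes per byte.

    Assumption: the low nibble holds the first element of each pair.
    """
    low = packed & 0x0F
    high = packed >> 4
    # Interleave so the output order is low0, high0, low1, high1, ...
    codes = np.stack([low, high], axis=-1).reshape(*packed.shape[:-1], -1)
    sign = np.where(codes & 0x8, -1.0, 1.0).astype(np.float32)
    return sign * E2M1_MAGNITUDES[codes & 0x7]


def decode_scale(scale_u8: np.ndarray) -> np.ndarray:
    """Assumption: each scale byte is an E8M0 exponent, value 2**(s - 127)."""
    return np.exp2(scale_u8.astype(np.float32) - 127.0)


def dequant_gemm(a: np.ndarray, b_packed: np.ndarray, b_scale: np.ndarray,
                 block: int = 32) -> np.ndarray:
    """Compute a @ B.T after dequantizing B.

    a:        (M, K) float32 (stand-in for bf16)
    b_packed: (N, K // 2) uint8, packed fp4 codes
    b_scale:  (N, K // block) uint8, one scale per block along K
    """
    b = unpack_fp4(b_packed)                                  # (N, K)
    scales = np.repeat(decode_scale(b_scale), block, axis=1)  # (N, K)
    return a @ (b * scales).T
```

A fused kernel would perform the unpacking and scaling per tile inside the pipeline rather than materializing the whole dequantized matrix, but the arithmetic is the same.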

Bf16* fp4 gemm with bias and swiglu

## Proposed changes

Added an example of bf16 * FP4 GEMM with bias and SwiGLU activation. In this implementation, both FP4 weights and FP4 scaling factors are stored in uint8...
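The bias and SwiGLU step applied to the GEMM output can be sketched as follows. This is an illustrative reference only, not the pull request's implementation. It assumes the output columns hold the gate projection in the first half and the up projection in the second half, and that the bias is added before the activation; the summary does not state either detail. The names `silu` and `bias_swiglu` are hypothetical.

```python
import numpy as np


def silu(x: np.ndarray) -> np.ndarray:
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))


def bias_swiglu(acc: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """Apply bias, then SwiGLU, to a GEMM accumulator.

    acc:  (M, 2 * H) GEMM output
    bias: (2 * H,) per-column bias
    Assumption: columns [0, H) are the gate, columns [H, 2H) are the up
    projection. Real kernels may interleave the two instead.
    """
    y = acc + bias
    gate, up = np.split(y, 2, axis=-1)
    return silu(gate) * up  # (M, H)
```

The output has half as many columns as the accumulator, which is why an epilogue of this kind is usually fused into the GEMM rather than run as a separate pass.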