Vipul Sharma comments

Results: 4 comments of Vipul Sharma

Failed to loading frames/audio

@parth1497 I'm working on at least reproducing the results of this project. Since you are also working on it, maybe we can resolve each other's doubts. There seem to be...

Add Codes to the material for reference

I've opened pull request #68 with example code for single, multiple, and multi-level inheritance. Please accept the pull request.
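
The contents of the pull request are not shown here, and neither is the language it uses. As a rough illustration of the three kinds of inheritance the comment names, a minimal Python sketch could look like the following. Every class and method name is hypothetical and chosen only for this example.

```python
# Hypothetical classes for illustration only; not taken from pull request #68.

class Animal:
    def describe(self):
        return "animal"


# Single inheritance: Dog has exactly one direct parent, Animal.
class Dog(Animal):
    def speak(self):
        return "woof"


# Multi-level inheritance: Puppy -> Dog -> Animal forms a chain.
class Puppy(Dog):
    def speak(self):
        return "yip"


class Swimmer:
    def move(self):
        return "swim"


class Flyer:
    def move(self):
        return "fly"


# Multiple inheritance: Duck has two direct parents.
# Python's method resolution order (MRO) checks Swimmer before Flyer,
# because Swimmer is listed first in the class statement.
class Duck(Swimmer, Flyer):
    pass
```

Here `Puppy` overrides `speak` but still reaches `describe` through the chain, and `Duck.__mro__` shows the order in which Python searches the parents of `Duck`.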

float8 inference weight-only quant should map to a fused kernel or explain why not

Hi, I ran torchao's llama benchmark script with `Float8WeightOnlyConfig` on an RTX 5090, and this config also shows abnormally high peak memory usage, along with the initially reported slowdown: ```...

float8 inference weight-only quant should map to a fused kernel or explain why not

Hi, if it helps, I did some benchmarking for root cause analysis (RCA). Although it's currently incomplete, it might give a better picture. The slowdown exists for torchao 0.13 but not for 0.14....