HDCharles

Results: 29 comments by HDCharles

Yeah, I will get started on this

@wconstab, for the quantization benchmarking, I'm wondering what a desirable 'scope' would be. The most natural type of quantization to benchmark is QAT, which has both training and evaluation. Then...

Well, here is an initial PR (https://github.com/pytorch/benchmark/pull/323). This one does C and A.

You should use the nightly version of torch, or at least the recent 2.2 branch cut; it's a newish op that was added for int4 support.

Looks like @ftian1 @holly1238 @yqhu wrote/landed the tutorial; can one of you take a look at this? The PyTorch quantization oncall is listed for this issue but the tutorial...

The quantization overhead is to blame, at least for the numbers in the README. You're doing the same amount of computation in the matmul but also have to decompress the...
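A rough sketch of the overhead point, in plain Python rather than torch. It assumes the truncated comment is talking about dequantizing weights before the matmul, which is a guess. The function names (`quantize_rows`, `matvec`, `quantized_matvec`) and the symmetric per-row int8 scheme are hypothetical and picked only for illustration; they are not from the repo being discussed.

```python
# Illustrative only: symmetric per-row int8 weight quantization in plain Python.
# All names here are hypothetical, not taken from any library.

def quantize_rows(weights):
    """Return (integer rows, per-row scales) using a symmetric int8 range."""
    q_rows, scales = [], []
    for row in weights:
        max_abs = max(abs(v) for v in row)
        scale = max_abs / 127 if max_abs > 0 else 1.0
        q_rows.append([round(v / scale) for v in row])
        scales.append(scale)
    return q_rows, scales

def matvec(weights, x):
    """Plain matrix-vector product: one multiply-add per weight."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def quantized_matvec(q_rows, scales, x):
    # Extra work compared with matvec: every weight is decompressed
    # (dequantized) before the same multiply-adds are performed.
    dequantized = [[q * s for q in row] for row, s in zip(q_rows, scales)]
    return matvec(dequantized, x)

weights = [[0.5, -1.0, 0.25], [2.0, 0.0, -0.5]]
x = [1.0, 2.0, 3.0]

q_rows, scales = quantize_rows(weights)
reference = matvec(weights, x)
approx = quantized_matvec(q_rows, scales, x)
```

The multiply-add count in `matvec` is the same on both paths; the quantized path adds one dequantize pass over the weights on top of it. Real kernels may fuse or hide that step, so this only shows where the extra work comes from.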

> One question about this @HDCharles. The SpinQuant repo has a dependency on the [CUDA fast Hadamard transform](https://github.com/Dao-AILab/fast-hadamard-transform) package for doing the actual Hadamard transform. Would it be acceptable to...

This shouldn't be in generate.py; it should be in eval so we can actually see the accuracy impact.

We don't really have lm_eval as a dependency, so I don't know if pinning it is really the solution here. If you wanted to submit a PR getting rid of...

Hey, this is looking nice so far. Long term, we probably want to make these tensor subclasses so that we can make serialization easier. That way, rather than having to...