functionstackx
Hi fbusato, thanks for your suggestion. 1. I believe I am already running the autotuning function `cusparseLtMatmulSearch`. Is there another function that I am missing? https://github.com/OrenLeung/CUDALibrarySamples/blob/e3cfb07e6b6625ec33b8526d82bebd5a21185624/cuSPARSELt/matmul/matmul_example.cpp#L348 2. I have already...
It seems that when the inputs are changed to a normal distribution centered around 0, sparse performance gets a bit better, with a 20% improvement over dense. https://github.com/OrenLeung/CUDALibrarySamples/commit/9cabba4b1154f2c49037d89171d41c31b6033c79 ``` # median...
@fbusato thanks for running it. By "800W h100", you mean 700W, right? We also see around a 1.20-1.22x improvement. Would you have any suggestions on shapes where sparsity would show...
@hliuca by "internally", do you mean in a closed-source package? Is there any way to gain access to that, or is there any way that it could be open...
@pytorchbot label ciflow/rocm
@jeffdaily all of the failures seem unrelated
@qcolombet @araslanix
https://github.com/sgl-project/sglang/blob/ddd1440d0f027e85af6be53bbb309683ed7ea2c4/.github/workflows/nightly-test.yml#L49-L64
@qcolombet yes, we are looking into it. @cquil11 is just trying to land a massive refactor PR first to reduce tech debt, and then we can look into this one.