supriyar
cc @drisspg @jainapurva
Thanks for reporting this, @felipemello1! Looks like some of the culprits are `torchao.float8.float8_utils` (~437 ms individual time) and `torchao.quantization.autoquant` (~220 ms individual time, ~999 ms cumulative). Also noticed that the float8 related modules have...
cc @andrewor14
cc @danielvegamyhre @vkuzo
@gau-nernst any thoughts on what might be the issue?
cc @andrewor14
cc @gau-nernst is this something you can take a look at?