
[WIP] Use sync-free cuda event timing in benchmark

Open xwang233 opened this issue 3 years ago • 0 comments

This PR is not the final form of the torchbench code changes. It is meant to start a discussion on how we should implement a sync-free CUDA event timing mechanism.

This PR uses sync-free CUDA event timing, as suggested in https://github.com/pytorch/pytorch/issues/93767

I've created a snippet to illustrate the idea: https://gist.github.com/xwang233/f00433a7826f485858ff0eaa59b3bd59
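For readers who want the general shape of the technique without opening the gist, here is a minimal sketch. It is not the code from the gist or from this PR; it only assumes that "sync-free" means recording a start/end CUDA event pair around each iteration without synchronizing in between, then synchronizing once at the end and reading all elapsed times. The helper names `time_sync_free` and `time_on_cuda` are hypothetical. The `torch.cuda.Event(enable_timing=True)`, `record()`, `elapsed_time()`, and `torch.cuda.synchronize()` calls are the standard PyTorch APIs.

```python
from typing import Callable, List


def time_sync_free(fn: Callable[[], None], n_iters: int,
                   make_event: Callable[[], object],
                   synchronize: Callable[[], None]) -> List[float]:
    """Return per-iteration times in ms, synchronizing only once.

    Hypothetical helper: events and the sync call are injected so the
    logic does not depend on a GPU being present.
    """
    # Allocate all events up front so nothing is created inside the timed loop.
    starts = [make_event() for _ in range(n_iters)]
    ends = [make_event() for _ in range(n_iters)]

    for start, end in zip(starts, ends):
        start.record()  # enqueued on the current stream; does not block the host
        fn()
        end.record()

    # Single host/device synchronization after all work has been queued.
    synchronize()

    # Safe to query now: every recorded event has completed.
    return [s.elapsed_time(e) for s, e in zip(starts, ends)]


def time_on_cuda(fn: Callable[[], None], n_iters: int = 10) -> List[float]:
    """Hypothetical wrapper binding the helper to real CUDA events."""
    import torch  # imported lazily; requires a CUDA-enabled PyTorch build

    return time_sync_free(
        fn,
        n_iters,
        make_event=lambda: torch.cuda.Event(enable_timing=True),
        synchronize=torch.cuda.synchronize,
    )
```

With a CUDA device available, `time_on_cuda(lambda: model(inputs))` would return one elapsed time in milliseconds per iteration, where `model` and `inputs` stand in for whatever the benchmark runs. Because the host never waits inside the loop, the measured iterations are not affected by per-iteration synchronization stalls.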

xwang233 · Jun 28 '22 19:06