ByteTransformer
Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052
I have encountered a memory leak problem on a 3090: the memory increases slowly, but it always increases.
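For what it's worth, a minimal way to confirm such a leak is to poll free device memory inside the inference loop with `cudaMemGetInfo`. This is just a sketch; `run_inference_step` below is a hypothetical stand-in for whatever model call you are actually making:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    size_t free_bytes = 0, total_bytes = 0;
    for (int step = 0; step < 1000; ++step) {
        // run_inference_step();  // hypothetical: call the model under test here
        if (step % 100 == 0) {
            // Poll free/total device memory; a steadily shrinking "free" value
            // across steps with a fixed workload indicates a leak.
            cudaMemGetInfo(&free_bytes, &total_bytes);
            printf("step %d: free %.1f MiB / total %.1f MiB\n",
                   step, free_bytes / 1048576.0, total_bytes / 1048576.0);
        }
    }
    return 0;
}
```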
While reading the paper I noticed that the time breakdown in Figure 3 is very detailed. I was wondering what tool you used to perform the time measurements?
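For reference, per-kernel timings like this are commonly collected either with a profiler (e.g. Nsight Systems) or directly with CUDA events. Below is a minimal CUDA-event sketch; `dummy_kernel` is just a placeholder and this is not ByteTransformer's actual instrumentation:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void dummy_kernel(float* x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= 2.0f;
}

int main() {
    const int n = 1 << 20;
    float* d_x;
    cudaMalloc(&d_x, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Warm up once so the measurement excludes one-time launch overhead.
    dummy_kernel<<<(n + 255) / 256, 256>>>(d_x, n);

    // Time a single launch by recording events around it on the same stream.
    cudaEventRecord(start);
    dummy_kernel<<<(n + 255) / 256, 256>>>(d_x, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("kernel time: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_x);
    return 0;
}
```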
Can I use ByteTransformer to train Transformer models on GPUs? Which models does it currently support?
Thanks for the marvelous work! [Lightseq](https://github.com/bytedance/lightseq) is also a transformer inference speedup library. I wonder if there is any performance comparison between ByteTransformer and Lightseq?
Hi there! Are you planning to add this kind of improvement for video models / methods, not "only" BERT? It would be amazing! PS: Thank you for sharing to...