
Clarification on Speed Improvement with `fused_window_process` and Its Necessity for Small-Scale Tasks

Open • Fanqyu opened this issue 1 year ago • 1 comment

Hi, thank you for your excellent work!

I have a question about `fused_window_process`. Now that the window process is implemented in the CUDA files, how significant is the speed improvement? Could you share some quantitative data (for example, throughput or latency with and without the fused kernel) to illustrate the performance gains?

Additionally, for smaller-scale tasks, is it necessary to use the fused window process, or would it be better to stick with the default implementation based on `torch.roll`?

Looking forward to your response!

Fanqyu avatar Oct 16 '24 09:10 Fanqyu

Hi @Fanqyu,

I'm a little late to answer your question, but for people who might be wondering the same thing, here is the original pull request which provides some information: https://github.com/microsoft/Swin-Transformer/pull/233.
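
For context on what is being compared: as I understand it, the default path does a cyclic shift with `torch.roll` and then a separate window partition, while the fused path does both in a single indexing pass. Below is a minimal sketch of that idea in plain Python, with nested lists standing in for tensors. The function names `roll2d`, `window_partition`, and `fused_roll_partition` are my own for illustration and are not the repository's API or its CUDA kernel. It only shows that the two paths produce the same windows and says nothing about actual speed.

```python
def roll2d(x, shift):
    """Cyclic shift along both axes, mirroring torch.roll(x, (shift, shift), dims=(0, 1))."""
    h, w = len(x), len(x[0])
    return [[x[(i - shift) % h][(j - shift) % w] for j in range(w)] for i in range(h)]


def window_partition(x, ws):
    """Split an H x W grid into non-overlapping ws x ws windows, each flattened row-major."""
    h, w = len(x), len(x[0])
    return [
        [x[wi + i][wj + j] for i in range(ws) for j in range(ws)]
        for wi in range(0, h, ws)
        for wj in range(0, w, ws)
    ]


def fused_roll_partition(x, shift, ws):
    """Same result as window_partition(roll2d(x, shift), ws), without the intermediate grid."""
    h, w = len(x), len(x[0])
    return [
        # Fold the cyclic shift into the index computation for each window element.
        [x[(wi + i - shift) % h][(wj + j - shift) % w] for i in range(ws) for j in range(ws)]
        for wi in range(0, h, ws)
        for wj in range(0, w, ws)
    ]


# Hypothetical toy input: a 4 x 4 grid, window size 2, shift of -1
# (Swin shifts by a negative amount before partitioning).
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
two_step = window_partition(roll2d(grid, -1), 2)
fused = fused_roll_partition(grid, -1, 2)
print(two_step == fused)
```

Since both paths give the same output, the choice between them is purely about speed, so for a small task it seems reasonable to time both on your own hardware before deciding.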

Asers387 avatar Feb 11 '25 15:02 Asers387