林趺菩
Hi, in this figure from the paper, does the grid-like shape denote the kernel or the feature map? I think it should denote the feature map.
I want to ask how to implement gradient accumulation with your work. My only computing resource is a single RTX 4090 (24 GB), so I'm not able to set the batch size to...
Can I ask how you implemented gradient accumulation in DeiT model training? I cannot find other resources on the internet that do gradient accumulation for DeiT training,...
Awesome work! I want to ask whether you have tested your method under more extreme MACs reduction ratios, for example the accuracy drop when pruning 80% of MACs. Thanks!
Excellent work! I want to ask whether the method can target different FLOPs pruning ratios. For example, I want to perform different levels of pruning, such as...
Hi, I want to ask whether there is any specific reason for the token keep ratio setting? [1, 1, 1, 0.7, 0.7, 0.7, 0.7^2, 0.7^2, 0.7^2, 0.7^3, 0.7^3, 0.7^3]...
Sorry for so many questions. Another problem I encountered is that my training result does not match the paper's result. For example, my training result for **Evo-ViT-S** is an accuracy of **66.22%**, which...
Excuse me, I want to ask: (1) Does this method provide a FLOPs reduction ratio? The paper uses throughput as the criterion. (2) How does the method work at extreme...
Hi, I want to ask about the behavior of the training stage and the inference stage. The paper illustrates the behavior of the training stage, but the inference stage is not mentioned....
Awesome work! However, I cannot run the project correctly yet. Please provide me with some information, thanks!