UM-MAE
Official Codes for "Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality"
I use upernet_mae_swin_tiny_256_mask...py, but when I load checkpoint-99-model.pth as the pretrained backbone, it reports: mmseg - WARNING - The model and loaded state dict do not match exactly...
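That warning usually appears when the pretraining checkpoint still carries decoder weights and key prefixes that the mmseg backbone does not expect. Below is a minimal conversion sketch, assuming a standard MAE-style checkpoint layout; the key names and the prefix filtering are assumptions for illustration, not taken from the repo:

```python
# Hypothetical sketch: keep only encoder weights from a MAE pretraining
# checkpoint and drop prefixes so mmseg can match the backbone keys.
import torch

ckpt = torch.load("checkpoint-99-model.pth", map_location="cpu")
state_dict = ckpt.get("model", ckpt)          # some checkpoints nest under "model"

backbone_state = {}
for k, v in state_dict.items():
    # drop decoder / mask-token weights the segmentation backbone never uses
    if k.startswith("decoder") or k.startswith("mask_token"):
        continue
    backbone_state[k.replace("module.", "")] = v

torch.save(backbone_state, "swin_tiny_backbone.pth")
```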
Sorry to bother you again. I have downloaded your fine-tuning weights and tested the detection part, but the following error appears: RuntimeError: GFL: FPN: Default process group...
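The truncated error looks like the usual "Default process group has not been initialized" failure that shows up when a config with SyncBN layers is run in a single, non-distributed process. A minimal workaround sketch, assuming single-process inference (this is a generic PyTorch workaround, not the repo's official fix):

```python
# Hypothetical single-process setup so SyncBN layers find a process group.
import torch.distributed as dist

if not dist.is_initialized():
    dist.init_process_group(
        backend="gloo",
        init_method="tcp://127.0.0.1:29500",
        rank=0,
        world_size=1,
    )
```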
Hi @implus, thanks for the great work and the released code! I have checked the `configs` in both `DET` and `SEG`, but found there are no configs for `ViT`, which is...
For faster training, I want to use fp16 for the Swin Tiny MAE pretraining. If you have run experiments with fp16, I would like to know your configuration, such as grad...
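If the repo ships no fp16 config, a generic torch.cuda.amp loop is one way to try it. A minimal sketch, assuming a standard PyTorch training loop; `model`, `loader`, and `optimizer` are placeholders, not the repo's objects, and the clipping value is only an example:

```python
# Hypothetical mixed-precision epoch with gradient clipping via GradScaler.
import torch

def train_one_epoch_fp16(model, loader, optimizer, device="cuda", clip_norm=5.0):
    scaler = torch.cuda.amp.GradScaler()
    model.train()
    for images, mask in loader:
        images = images.to(device, non_blocking=True)
        optimizer.zero_grad()
        with torch.cuda.amp.autocast():
            loss = model(images, mask)                     # pretraining loss
        scaler.scale(loss).backward()
        scaler.unscale_(optimizer)                         # unscale before clipping
        torch.nn.utils.clip_grad_norm_(model.parameters(), clip_norm)
        scaler.step(optimizer)
        scaler.update()
```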
Thank you for the code! I'm pretraining Swin Tiny on my own dataset (around 799K training samples). The training loss is decreasing slowly and reached around 0.115 at epoch 100. Can you please...
Hi @implus, could you kindly provide the full checkpoints (including the decoder) of Swin-T and PVT-S? Many thanks!
Why are the visible patches in the decoder simply concatenated with the mask tokens for the invisible patches, rather than being placed back at their original positions? There seems to be a problem with this section...
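For reference, in the original MAE decoder the concatenation is followed by a gather with `ids_restore`, which puts every token back at its original spatial position. The sketch below mirrors that standard mechanism; it is an assumption about what UM-MAE does, not a quote from its code:

```python
# Hypothetical MAE-style token unshuffle: append mask tokens, then gather
# with ids_restore so each token returns to its original position.
import torch

def restore_token_order(x_visible, mask_token, ids_restore):
    # x_visible: [B, N_vis, D], mask_token: [1, 1, D], ids_restore: [B, N]
    B, N_vis, D = x_visible.shape
    N = ids_restore.shape[1]
    mask_tokens = mask_token.expand(B, N - N_vis, D)
    x = torch.cat([x_visible, mask_tokens], dim=1)                    # [B, N, D]
    x = torch.gather(x, 1, ids_restore.unsqueeze(-1).repeat(1, 1, D))  # unshuffle
    return x
```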
In the MAE paper, Kaiming He notes that if ViT-MAE sees the mask token during training but cannot see it during testing, an inconsistency exists, which results in worse accuracy. But...
Hi, thanks for your code. There may be a bug in [this line](https://github.com/implus/UM-MAE/blob/main/mask_transform.py#L69). I tried your code on my own tasks, where the image shape is [382, 382], but args.token_size=16 means self.num_patches=256,...
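One way to avoid that mismatch is to derive the mask grid from the actual input size instead of hard-coding it. The helper below is an illustrative assumption (patch size 16, square inputs), not the repo's code:

```python
# Hypothetical helper: compute the mask grid side length from the input size.
def mask_grid_size(input_size: int, patch_size: int = 16) -> int:
    assert input_size % patch_size == 0, "input must be divisible by patch size"
    return input_size // patch_size      # e.g. 256 -> 16 tokens per side (256 patches)

print(mask_grid_size(256))  # 16
```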