SegFormer

About the PatchEmbedding

Open SimonServant opened this issue 3 years ago • 2 comments

Hello dear author,

the SegFormer paper states that SegFormer starts by "first dividing input into patches of size 4 × 4, which is smaller than the original ViT 16x16". However, unlike in the PVT code, the patch size parameter of the patch embedding is not used. Instead, the patch embeddings use the sizes 7, 3, 3, 3, which you mention in the paper. This means that the initial embedding does not use 4x4 patches, even though they are still smaller than the original 16x16 patches. I wanted to ask whether I misunderstood the paper or the code, or whether this is a small oversight on your side?

Thank you very much.

SimonServant, Aug 15 '22 18:08

I am very sorry, I miscalculated the padding. With the padding taken into account, the result is the same as when using kernel size 4. However, I still cannot see where the patch of size 4x4x3 named in the paper appears.
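For reference, here is a minimal sketch of the arithmetic behind that correction. It uses the standard convolution output-size formula; the stride of 4 and padding of 3 for the first embedding, as well as the 512-pixel input, are assumptions chosen for illustration and are not taken from this thread. The function name `conv_output_size` is hypothetical.

```python
def conv_output_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length along one spatial axis of a convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumed input resolution along one axis (illustrative only).
H = 512

# Overlapping embedding: kernel 7, with ASSUMED stride 4 and padding 3.
overlapping = conv_output_size(H, kernel=7, stride=4, padding=3)

# Non-overlapping 4x4 patches: kernel 4, stride 4, no padding.
non_overlapping = conv_output_size(H, kernel=4, stride=4, padding=0)

print(overlapping, non_overlapping)  # 128 128
```

Under these assumptions both variants produce a feature map at 1/4 of the input resolution. The difference is that each output position of the kernel-7 version sees a 7x7 window that overlaps its neighbours, rather than a disjoint 4x4 patch.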

SimonServant, Aug 15 '22 18:08