Compact-Transformers
[Preprint] Escaping the Big Data Paradigm with Compact Transformers, 2021
Hi, this work is awesome. I just have one little question. The paper says the total batch size is 128 for the CIFARs and that 4 GPUs were used in parallel. That...
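For anyone landing on this issue: a minimal sketch of what "total batch size 128 across 4 GPUs" usually means in data-parallel training, i.e. 32 samples per process. The dummy CIFAR-shaped tensors and the plain `DataLoader` here are illustrative assumptions, not the repo's actual training script:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

world_size = 4                             # number of GPUs, per the issue
total_batch = 128                          # total batch size, per the paper
per_gpu_batch = total_batch // world_size  # 32 samples per process

# Dummy CIFAR-shaped data just to make the example self-contained.
dataset = TensorDataset(torch.randn(256, 3, 32, 32),
                        torch.randint(0, 10, (256,)))
loader = DataLoader(dataset, batch_size=per_gpu_batch, shuffle=True)

images, labels = next(iter(loader))
print(images.shape)  # torch.Size([32, 3, 32, 32])
```

Under `DistributedDataParallel`, each of the 4 processes would run a loader like this, so the effective batch per optimizer step is 4 × 32 = 128.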
Hi, I am a little confused about the output of the CCT. If I have a classification task with n possible classes, are the outputs the logits for each class?...
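A hedged sketch of the usual convention this question is about: for an n-class task, a classifier's forward pass returns raw logits of shape `(batch, n)`, and you apply softmax (or `cross_entropy`, which takes logits directly) yourself. The random tensor below is a stand-in for `model(images)`, not a call into the repo's API:

```python
import torch
import torch.nn.functional as F

n_classes, batch = 10, 4
logits = torch.randn(batch, n_classes)    # stand-in for model(images)
probs = F.softmax(logits, dim=-1)         # per-class probabilities
targets = torch.randint(0, n_classes, (batch,))
loss = F.cross_entropy(logits, targets)   # cross_entropy expects raw logits
```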
Hi, there was a small problem with the mask returned from TextTokenizer's forward function. The next function that uses this mask needs a 2D tensor. Therefore, in TextTokenizer, the mask should...
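To illustrate the shape mismatch being reported (the shapes here are hypothetical; the tokenizer's internals are paraphrased from the issue, not copied from the repo): if the tokenizer's pooling leaves a singleton dimension on the mask, squeezing it restores the 2D `(batch, seq_len)` shape the downstream layer expects:

```python
import torch

mask = torch.ones(8, 1, 64, dtype=torch.bool)  # hypothetical (batch, 1, seq_len) mask
mask_2d = mask.squeeze(1)                      # -> (batch, seq_len), as the next layer needs
print(mask_2d.shape)                           # torch.Size([8, 64])
```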
Hello, It seems to make more intuitive sense to use 1D convolutions here over the embedding with a channel size equal to the word embedding dimension, rather than the edge-case...
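A sketch of the alternative this issue suggests (illustrative shapes; not the repo's current tokenizer code): treat the word-embedding dimension as the `Conv1d` channel axis and convolve along the sequence, instead of a 2D convolution with a width-1 edge case:

```python
import torch
import torch.nn as nn

batch, seq_len, embed_dim = 8, 64, 300
x = torch.randn(batch, seq_len, embed_dim)     # (B, L, D) word embeddings

# Conv1d expects (B, C, L), so the embedding dim becomes the channel dim.
conv = nn.Conv1d(in_channels=embed_dim, out_channels=128,
                 kernel_size=3, padding=1)
y = conv(x.transpose(1, 2)).transpose(1, 2)    # back to (B, L, 128)
print(y.shape)                                 # torch.Size([8, 64, 128])
```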
First of all, thanks for your amazing work! And it seems that your `TransformerEncoderLayer` implementation is a bit different from the 'mainstream' implementations, because you create your residual link **after**...
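For readers unfamiliar with the distinction this issue raises, here is a minimal sketch of the two residual placements being contrasted: pre-norm (normalize, then attend, then add the residual) versus the "mainstream" post-norm (attend, add the residual, then normalize). The module below is illustrative, not the repo's `TransformerEncoderLayer`:

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self, dim, pre_norm=True):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.pre_norm = pre_norm

    def forward(self, x):
        if self.pre_norm:
            # pre-norm: residual wraps norm + attention
            h = self.norm(x)
            return x + self.attn(h, h, h)[0]
        # post-norm: attention first, norm applied after the residual
        return self.norm(x + self.attn(x, x, x)[0])

x = torch.randn(2, 16, 64)
print(Block(64)(x).shape)  # torch.Size([2, 16, 64])
```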
It seems the models function call in examples/main.py is failing with the error message:

```
Traceback (most recent call last):
  File "/mnt/code/Compact_Transformers/examples/main.py", line 279, in <module>
    main()
  File "/mnt/code/Compact_Transformers/examples/main.py", line 127,...
```
This is incredibly exciting! Thank you so much. I'm interested in exploring this with NLP. Unfortunately, I'm running into some issues that seem to be related to expected tensor sizes...
Firstly, thank you very much for your work. But when I used your open-source code to classify the images in my dataset, the accuracy was not ideal, perhaps because the model...
Hi team, thanks for open-sourcing your work. May I ask if you know of any implementation of these models in Flax/Jax? Best, Naren
Hi, I am trying to replicate your Text Classification results so that I can then use your models on my own data set, but I am unable to get any...