CheerM
Hi there, it's exciting to see that post-norm can stabilize the training process. We tried to re-implement Swin-T with post-norm, as reported in Table 6 (81.6 top-1 acc) in...
Hi, this is a great and concise re-implementation of the MLP works. Meanwhile, I'm wondering how the performance of the re-implemented version compares to the performance reported in the manuscripts? It would...
Hi, would you mind releasing the training log for T2T-ViT-t-14 trained with 8 GPUs? I tried to rerun the script for training T2T-ViT-t-14 with 8 GPUs. It gained 0.094 for...
Hi, I ran into some issues with the re-implementation of models trained on IWSLT'14 De-En. The hyper-parameters of DeLighT (d_m=512) were set following https://github.com/pytorch/fairseq/blob/master/examples/translation/README.md, like `CUDA_VISIBLE_DEVICES=0 fairseq-train \ data-bin/iwslt14.tokenized.de-en \ --arch transformer_iwslt_de_en...
Hi, just wondering what the re-implemented performance is for the dense/random/hybrid synthesizers?