Ge Yuan
> Your code contains mixed spaces and tabs; it's easy to fix... If you used Python 2.7, there would be no bug.
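[Intriguing Properties of Vision Transformers](https://arxiv.org/abs/2105.10497) is the paper I saw earlier. Also, a question about your occluded-face experiments: you did not add occlusion during training and only added it at test time, right?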
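I would also like to ask: (1) When using the AdamW optimizer, how did you find a suitable learning rate? (2) My approach is to train for 8000 steps and then see which learning-rate setting gives the highest accuracy on the LFW and CFP-FP test sets. Because my resources are limited, I cannot train every run to completion and compare the final accuracy. Is this a reasonable way to search for a learning rate?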
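The short-run comparison I describe could be sketched roughly as follows. This is only an illustration of the selection step, not code from the repository: `pick_learning_rate` is a made-up helper, and `evaluate_short_run` is a hypothetical callable that would train for a fixed step budget (for example 8000 steps) with the given learning rate and return a single score, such as the mean accuracy on LFW and CFP-FP.

```python
from typing import Callable, Dict, Iterable, Tuple


def pick_learning_rate(
    candidates: Iterable[float],
    evaluate_short_run: Callable[[float], float],
) -> Tuple[float, Dict[float, float]]:
    """Return the candidate with the highest short-run score, plus all scores.

    `evaluate_short_run` is a hypothetical callable (an assumption of this
    sketch): it trains with the given learning rate for a fixed number of
    steps and returns one validation score, higher being better.
    """
    # Run one short training job per candidate and record its score.
    scores = {lr: evaluate_short_run(lr) for lr in candidates}
    # Pick the learning rate whose short run scored highest.
    best_lr = max(scores, key=scores.get)
    return best_lr, scores
```

Note that this ranks learning rates by early progress only, so it assumes the ranking after the short budget matches the ranking after full training.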
Downgrading to `gcc-9`, `g++-9`, and `cuda-11.3` works for me.
> @ygtxr1997 can you share? I have some ideas on how to get training working by reworking the architecture a bit. Basically replacing the attention layers.

Sorry, I cannot...
@mahicool The `Hugging Face Space` seems to be in use by many people right now. You can copy the `Hugging Face Space` to your own private space with a GPU or clone its `code`...
You can use this command to check whether the CUDA version used to build PyTorch matches the runtime CUDA version:

```shell
python -m torch.utils.collect_env
```

Output (11.7 != 11.3,...
> better one, close to make-money level (if clean garbage):

Amazing results! Can you share your training input video of this lion toy?
> It is total number of GPUs, we then reduce it by `num_machines`. (That SLURM example looks to be wrong possibly)

Given 4 nodes and 8 GPUs per node, do...
> I'm stating the launcher will reduce it. `--num_processes` is the _total_ number of GPUs and assumes each node has the same number of GPUs on each. So rather than...
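To make the arithmetic in this exchange concrete, here is a small sketch for the 4-node, 8-GPUs-per-node case. It only illustrates the relationship described in the quotes (`--num_processes` is the total, and dividing by `num_machines` gives the per-node count). It is not the launcher's actual code, and `processes_per_node` is a hypothetical helper name.

```python
def processes_per_node(num_processes: int, num_machines: int) -> int:
    """Split a total process count evenly across machines.

    Hypothetical helper for illustration; assumes every node has the
    same number of GPUs, as stated in the quoted comment.
    """
    if num_machines <= 0:
        raise ValueError("num_machines must be positive")
    if num_processes % num_machines != 0:
        raise ValueError("num_processes must be divisible by num_machines")
    return num_processes // num_machines


num_machines = 4      # nodes
gpus_per_node = 8     # GPUs on each node
# Value to pass as the total process count: all GPUs across all nodes.
num_processes = num_machines * gpus_per_node
# Count each node ends up with after dividing the total by the node count.
per_node = processes_per_node(num_processes, num_machines)
```

Under these assumptions the total is 32 and each node runs 8 processes.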