Yurui Zhu

Results: 10 comments by Yurui Zhu

The training process is described at https://gitee.com/mindspore/models/tree/master/research/cv/IPT. Shouldn't the scale here be 6 (the paper uses 6 kinds of data)? What does 2+3+4+1+1+1 mean here? python train_finetune.py --distribute --imagenet 0 --batch_size 64 --lr 2e-5 --scale 2+3+4+1+1+1 --model vtip --num_queries 6 --chop_new --num_layers 4 --task_id $TASK_ID --dir_data $DATA_PATH --pth_path $MODEL...

> > Stage 1: pre-training. Stage 2: finetuning on the specific task. > > But in stage 1 the multi-heads and multi-tails are trained; during training, each batch randomly selects paired data from one task and feeds it into the model, and backpropagation updates the corresponding head, tail, and body. Is it necessary to ensure that, while training task A, the heads and tails of the other tasks stay unchanged (are not updated)? > > Stage 2: only the head and tail of the relevant task are kept, and the other heads and tails are simply discarded. > > I'd like to confirm this process. > > Yes...
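The freezing behavior asked about above largely falls out of autograd for free: if a batch only flows through one task's head and tail, the other tasks' modules receive no gradient. A minimal PyTorch sketch (the class and module names here are illustrative stand-ins, not the actual IPT code):

```python
import torch
import torch.nn as nn

# Hypothetical multi-head / multi-tail model: one head and one tail per task,
# plus a shared body (a Linear stands in for the transformer body).
class MultiTaskModel(nn.Module):
    def __init__(self, num_tasks=6, dim=8):
        super().__init__()
        self.heads = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_tasks))
        self.body = nn.Linear(dim, dim)
        self.tails = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_tasks))

    def forward(self, x, task_id):
        # Only the selected task's head and tail enter the autograd graph,
        # so only they (and the shared body) receive gradients.
        return self.tails[task_id](self.body(self.heads[task_id](x)))

model = MultiTaskModel()
x = torch.randn(4, 8)
loss = model(x, task_id=0).sum()
loss.backward()

print(model.heads[0].weight.grad is not None)  # selected head got a gradient
print(model.heads[1].weight.grad is None)      # unused head got none
```

With plain SGD, parameters whose gradient is `None` are left untouched, so the other tasks' heads and tails are effectively frozen for that step. Note that optimizers applying weight decay or stale momentum to all parameters could still change them, in which case explicitly calling `requires_grad_(False)` on inactive heads/tails would be the safer choice.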

How can I provide multiple inputs to hold a multi-turn conversation?

Thank you for your detailed responses and for clearing up my confusion. The idea of using masked operations to improve generalization ability is both interesting and practical. However, I still have...

I am still confused about the 'norm function' of the Neural Representation. The answer above still does not seem to clarify the reason.

Thanks! I have read Sec 3.2 again. Is the 'norm function' of NRN actually observed during the experiments? Moreover, can you explain the effect of the hyperparameter _L_ in achieving...

Thanks! Actually, such a brightness-normalizing effect of NR is surprising and interesting.

> Found it. Testing now. I'd suggest the official team provide a multi-GPU inference tutorial. Thanks!

Where did you find it?