dlnlpchenliyu
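I have the same question. The paper does not mention `non_neg_mask` at all. Judging from tfrecord.py and tfrecord.sh, every value in `labels` is greater than or equal to 0. train.py contains these two lines: `non_neg_mask = tf.fill(tf.shape(labels), -1.0, name='non_neg')` and `non_neg_mask = tf.cast(tf.not_equal(labels, non_neg_mask), tf.float32)`. After they run, every value in `non_neg_mask` should be 1.0. So what is `non_neg_mask` actually for?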
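To make the point concrete, here is a minimal sketch in plain Python (not TensorFlow) that mimics the element-wise comparison in those two lines. The helper name `non_neg_mask`, the `sentinel` parameter, and both sample label arrays are made up for illustration. The second array assumes that -1.0 could mark a missing label, which is only a guess and not something confirmed by the repo.

```python
# Plain-Python imitation of the two train.py lines quoted above.
# tf.fill(..., -1.0) builds a tensor of -1.0, and tf.not_equal compares
# labels against it element-wise; the cast turns True/False into 1.0/0.0.

def non_neg_mask(labels, sentinel=-1.0):
    """Return 1.0 where a label differs from the sentinel, else 0.0."""
    return [[1.0 if value != sentinel else 0.0 for value in row]
            for row in labels]

# Case described in the comment: all labels are >= 0.
labels_all_valid = [[0.0, 1.0, 0.0],
                    [1.0, 1.0, 0.0]]
mask_all_valid = non_neg_mask(labels_all_valid)

# Hypothetical case (assumption): -1.0 marks an unknown label.
labels_with_missing = [[0.0, -1.0, 1.0],
                       [-1.0, 1.0, 0.0]]
mask_with_missing = non_neg_mask(labels_with_missing)

print(mask_all_valid)     # all ones, so the mask changes nothing
print(mask_with_missing)  # zeros only at the -1.0 positions
```

If the labels really never contain -1.0, the mask is all ones and multiplying by it has no effect. It would only matter for data in which some entries are set to -1.0.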
> The best finetune result(finetune from the pretrain model you published) I get is 56.62,83.96,90.56 which is still 1.6 lower than your reported result, furthermore, the zero-shot evaluation result from...
> > Assuming we have the following: > > ``` > > CHECKPOINT=/path/to/fsdp_sharded_checkpoint/checkpoint_last > > CONSOLIDATED=/path/to/new_consolidated_checkpoint/ > > RESHARDED=/path/to/new_resharded_checkpoint/ > > MP=16 > > ``` > > > > >...