Danni (Danqing) Zhang

Results: 51 comments by Danni (Danqing) Zhang

Hi Yaoyao, thanks for the reply! I see; I can report the numbers here later once I finish the experiments. For (2), so what you mentioned is the FT and...

@yaoyao-liu, then for "SS[Θ;θ], HT meta-batch" in Table 2, is that also the pre-trained model without the first fine-tuning step? I mean, which experiments in Table 2 have the...

The differences between your proposed MTL algorithm and the MAML-ResNet algorithm are: 1) fine-tuning; 2) HT; and 3) FT->SS meta-training operations. If we want to claim that the "SS meta-training operations" work,...

Oh I see, I thought ResNet-12 (pre) means the ResNet-12 without any fine-tuning. By the "first fine-tuning" step I mean the "(a) large-scale DNN training" step.

For Table 1, did you first conduct the "(a) large-scale DNN training" step?

Yeah, I understand that when loading the pre-trained models, we have to drop the classifier parameters and only use the encoder parameters. This is like a domain fine-tuning step, adapting the pre-trained model...
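For reference, a minimal PyTorch sketch of that loading pattern; the checkpoint path, the `classifier` key prefix, and `build_model` are hypothetical placeholders, not names from the repo:

```python
import torch

# Hypothetical checkpoint path and key prefix; the real names depend
# on how the pre-trained model was saved.
checkpoint = torch.load("pretrained_resnet12.pth", map_location="cpu")

# Keep only the encoder weights; drop the large-scale classifier head,
# whose output dimension will not match the few-shot tasks anyway.
encoder_state = {k: v for k, v in checkpoint.items()
                 if not k.startswith("classifier")}

model = build_model()  # hypothetical constructor for the meta-learner
missing, unexpected = model.load_state_dict(encoder_state, strict=False)
# strict=False leaves the freshly initialized classifier parameters untouched.
```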

> > For Table 1, did you first conduct the "(a) large-scale DNN training" step?
>
> Yes. In the caption, you can see "ResNet-12 (pre)" is applied.

I see,...

Hi @yaoyao-liu, I have an additional question: if we don't run the large-scale DNN training step and just run the experiment with "SS[Θ;θ], HT meta-batch", will the performance be...

I found https://github.com/liuyukid/transformers-ner/blob/master/models/bert_ner.py#L110-L111, where the author changes the labels with -100 to 0 and uses the attention mask as the mask for the CRF. However, this will add those tokens that are...
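To illustrate the concern, here is a minimal sketch assuming the pytorch-crf package; the tensors are toy data, not the repo's actual inputs:

```python
import torch
from torchcrf import CRF  # pytorch-crf; `pip install pytorch-crf`

num_tags = 9
crf = CRF(num_tags, batch_first=True)

emissions = torch.randn(2, 5, num_tags)            # (batch, seq_len, num_tags)
labels = torch.tensor([[-100, 3, -100, 4, -100],   # -100 marks subword/special tokens
                       [-100, 1, 2, -100, -100]])
attention_mask = torch.tensor([[1, 1, 1, 1, 1],
                               [1, 1, 1, 1, 0]])

# The pattern in question: map ignored (-100) positions to tag 0
# and reuse the attention mask as the CRF mask.
tags = labels.clone()
tags[tags == -100] = 0
loss = -crf(emissions, tags, mask=attention_mask.bool())

# Every attended position, including the originally ignored -100 ones
# (now tag 0), contributes to the CRF likelihood, which is the issue
# raised above.
print(loss)
```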

Oh sorry, there is an n_gpu parameter, and I can change CUDA_VISIBLE_DEVICES...
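For anyone else hitting this, a minimal sketch of restricting the visible GPUs from Python; the device IDs are just an example:

```python
import os

# Must be set before CUDA is initialized, i.e. before any torch.cuda call;
# "0,1" is an example selection of GPU IDs.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

import torch
print(torch.cuda.device_count())  # now reports only the visible GPUs
```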