Haichao Zhu
> The task-specific linear head is fine-tuned together with the prompt embeddings. A comparison between the LM head and the task-specific linear head is provided in our experiments (Table 5), which shows that...
> Yes, the LM head cannot be applied to sequence tagging for now. Your observation on PT with SQuAD is quite interesting. Have you frozen the pre-trained model's parameters?...
@Xiao9905 Hi, could you share the hyperparameters and optimizer configuration used for PT2 on SQuAD 1.1 with RoBERTa-large, such as the learning rate, prompt length, number of epochs or max steps, warmup ratio, weight decay,...
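
For readers following along, here is a minimal, hypothetical sketch of the setup being discussed: the pre-trained backbone is frozen, and only the continuous prompt embeddings plus a task-specific linear head (a SQuAD-style start/end span head) receive gradients. This is a shallow, input-level prompt for illustration only; P-tuning v2 itself injects prefix key/value vectors at every layer, and the class name, prompt length, and learning rate below are assumptions, not the repo's actual code or the paper's hyperparameters.

```python
import torch
from torch import nn
from transformers import RobertaModel


class PromptTunedQAModel(nn.Module):
    """Frozen RoBERTa backbone + trainable prompt embeddings + linear QA head (illustrative sketch)."""

    def __init__(self, model_name="roberta-large", prompt_len=16):
        super().__init__()
        self.backbone = RobertaModel.from_pretrained(model_name)
        hidden = self.backbone.config.hidden_size
        # Trainable continuous prompt embeddings (prompt_len is an assumption).
        self.prompt_embeddings = nn.Embedding(prompt_len, hidden)
        # Task-specific linear head: start/end logits for extractive QA.
        self.qa_head = nn.Linear(hidden, 2)
        # Freeze every pre-trained parameter; only prompts + head are updated.
        for p in self.backbone.parameters():
            p.requires_grad = False

    def forward(self, input_ids, attention_mask):
        batch_size = input_ids.size(0)
        prompt_len = self.prompt_embeddings.num_embeddings
        prompt_ids = torch.arange(prompt_len, device=input_ids.device)
        prompts = self.prompt_embeddings(prompt_ids).unsqueeze(0).expand(batch_size, -1, -1)
        # Raw word embeddings; the backbone adds positional embeddings itself.
        token_embeds = self.backbone.get_input_embeddings()(input_ids)
        inputs_embeds = torch.cat([prompts, token_embeds], dim=1)
        prompt_mask = torch.ones(batch_size, prompt_len,
                                 dtype=attention_mask.dtype,
                                 device=attention_mask.device)
        extended_mask = torch.cat([prompt_mask, attention_mask], dim=1)
        hidden_states = self.backbone(inputs_embeds=inputs_embeds,
                                      attention_mask=extended_mask).last_hidden_state
        # Drop the prompt positions and predict start/end logits per token.
        logits = self.qa_head(hidden_states[:, prompt_len:])
        start_logits, end_logits = logits.split(1, dim=-1)
        return start_logits.squeeze(-1), end_logits.squeeze(-1)


model = PromptTunedQAModel()
# Only the trainable parameters (prompts + head) go to the optimizer;
# the learning rate is a placeholder, not the paper's actual setting.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=5e-3)
```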