Soonhwan-Kwon
> You can try the conversion script in the `tf_to_pytorch` directory. It seems that this leads to an error saying it can't find the missing keys
@authman @chenshen03 I finally made it work with tensorflow==1.14.0, and it works perfectly. Thank you!
Thank you for finding this error. We had no time to confirm it until now, and you have helped our project very much, thank you! I was also able to confirm it today, again...
I'm also interested in a deberta-mt implementation, and there is a grey area in the unilm implementation, for example how we can implement disentangled attention, how the author dealt with the relative position bias...
I encountered the same situation with a customized model, and I feel stuck, because when you turn on apex amp as the fp16 backend you can't use ZeRO.
Test result: 'giant panda , chengdu , china '
   
Some are good but some are bad, and the model needs to be fine-tuned on the COCO dataset, as in the CoCa paper, for better results. I'm evaluating the scores on COCO...
It is a much slower implementation because it runs w/o past_key_values, but I expect it to be much faster w/ past_key_values. I wanted to move on step by step, because...
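To illustrate why decoding w/o past_key_values is slower, here is a minimal toy sketch in plain Python. It is not the code from this thread or from any library: the function names, the single-head attention, and the use of the token vector itself as the query are all assumptions made for illustration. Without a cache, every step re-projects keys and values for the whole prefix; with a cache, each step projects only the newest token and reuses the rest, and both paths should give the same outputs.

```python
import math


def project(x, matrix):
    """Multiply vector x by a matrix given as a list of rows (len(x) rows)."""
    return [sum(x[i] * matrix[i][j] for i in range(len(x)))
            for j in range(len(matrix[0]))]


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


def decode_without_cache(tokens, w_k, w_v):
    """Hypothetical decoder loop: recompute all keys/values at every step."""
    outputs = []
    for t in range(len(tokens)):
        prefix = tokens[: t + 1]
        keys = [project(x, w_k) for x in prefix]    # O(t) projections per step
        values = [project(x, w_v) for x in prefix]
        outputs.append(attend(tokens[t], keys, values))
    return outputs


def decode_with_cache(tokens, w_k, w_v):
    """Hypothetical decoder loop: keep past keys/values and append one per step."""
    cached_keys, cached_values = [], []
    outputs = []
    for x in tokens:
        cached_keys.append(project(x, w_k))         # one projection per step
        cached_values.append(project(x, w_v))
        outputs.append(attend(x, cached_keys, cached_values))
    return outputs
```

In this sketch the uncached loop does a number of projections that grows quadratically with the sequence length, while the cached loop grows linearly, which is the kind of speedup expected from past_key_values.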
> It would be really cool if you could make finetuning called for plugging image embeddings into the coca text decoder and train only the decoder :) It sounds very...