[Electra] Convert Electra pretrained checkpoint into Huggingface pytorch model
I have pretrained Electra on Arabic data. I got final_loss=8.84 at the end of pretraining, and everything went pretty well. Now I want to use it as a Huggingface PyTorch model. I tried to convert it with the convert_electra_original_tf_checkpoint_to_pytorch script (here), but it failed because my pretrained checkpoint is not in the same format as the original one from Google. How should I load this model as a Huggingface PyTorch model?
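For reference, the conversion I attempted boils down to something like the following minimal sketch. Paths are placeholders, and `load_tf_weights_in_electra` is the helper the stock script calls, assuming a transformers version that exports it:

```python
# Minimal sketch of the stock conversion path; paths are placeholders and
# load_tf_weights_in_electra is assumed to be exported by your transformers version.
from transformers import ElectraConfig, ElectraForPreTraining, load_tf_weights_in_electra

config = ElectraConfig.from_json_file("electra_config.json")  # the config used for pretraining
model = ElectraForPreTraining(config)

# The helper looks up variables by TF1-style names, which is where a TF2
# object-based checkpoint like NVIDIA's diverges from Google's original
# format and the load fails.
load_tf_weights_in_electra(
    model,
    config,
    "path/to/pretrained_checkpoint",
    discriminator_or_generator="discriminator",
)
model.save_pretrained("electra-arabic-pytorch")
```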
You only need to extract the generator or the discriminator from the pretrained checkpoint for conversion, typically the discriminator. Follow the steps listed here to extract the individual parts.
- Step 6 of the quick start guide (this is the script being used): https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow2/LanguageModeling/ELECTRA#quick-start-guide
Once extracted, you should be able to point the conversion at the discriminator, for example using `<checkpoint>/discriminator`.
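If the conversion still complains, it helps to dump the variable names in the extracted checkpoint first, for example with a short sketch like this (the path is a placeholder):

```python
# Sanity check: list the variables in the extracted checkpoint so you can see
# whether the names match what the Huggingface converter expects.
import tensorflow as tf

reader = tf.train.load_checkpoint("<checkpoint>/discriminator")  # placeholder path
for name, shape in sorted(reader.get_variable_to_shape_map().items()):
    print(name, shape)
```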
The postprocess step's result is an .h5 file.
Huggingface's conversion script can convert a TF1 checkpoint to PyTorch, but NVIDIA's checkpoint is TF2.
So I can't find any existing solution that converts TF2 to PyTorch.
One idea is to convert the TF2 checkpoint to TF1 and then convert TF1 to PyTorch with Huggingface's script, but converting a TF2 checkpoint to TF1 is quite difficult for me. A rough sketch of what I mean is below.
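The general shape of that idea would be something like the following untested sketch: read every variable out of the TF2 (object-based) checkpoint and re-save it as a TF1 (name-based) one. The checkpoint path is a placeholder, and the name cleanup is generic; NVIDIA's variable layout may need an extra mapping before Huggingface's TF1 loader accepts it:

```python
# Untested sketch of the TF2 -> TF1 idea: dump every variable from the TF2
# object-based checkpoint and re-save it as a TF1 name-based checkpoint.
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

reader = tf.train.load_checkpoint("path/to/tf2_checkpoint")  # placeholder path

with tf.Session() as sess:
    for name in reader.get_variable_to_shape_map():
        # Skip bookkeeping entries and optimizer slots; keep only model weights
        if (".OPTIMIZER_SLOT" in name
                or "save_counter" in name
                or "_CHECKPOINTABLE_OBJECT_GRAPH" in name):
            continue
        # TF2 object-based names end in "/.ATTRIBUTES/VARIABLE_VALUE";
        # strip the suffix so the TF1 checkpoint has plain variable names
        clean_name = name.replace("/.ATTRIBUTES/VARIABLE_VALUE", "")
        tf.Variable(reader.get_tensor(name), name=clean_name)
    sess.run(tf.global_variables_initializer())
    tf.train.Saver().save(sess, "tf1_checkpoint/model.ckpt")
```

Whether the cleaned names line up with what the Huggingface converter expects depends entirely on how NVIDIA named their layers, so it is worth dumping the names first, as in the inspection snippet above.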