
Question about v3 pretraining code of DeBERTa

Open stefan-it opened this issue 3 years ago • 3 comments

Hi @DaoTranbk and @HyTruongSon,

many thanks for open-sourcing the repo for ViDeBERTa!

I'm very interested in the v3 pretraining of a DeBERTa model. In the current version of the pretraining code, I can see that the normal DeBERTa package is called:

https://github.com/HySonLab/ViDeBERTa/blob/8270cceb4833bbfa13b4b4d9c4859968501a96be/pre-training/bash/pre-train_model.sh#L13

However, the publicly available DeBERTa code does not yet include support for Gradient-Disentangled Embedding Sharing (GDES); see e.g. https://github.com/microsoft/DeBERTa/issues/93.
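
For context, GDES as described in the DeBERTaV3 paper can be summarized by the relation below. The symbols are used here for illustration only and are not taken from either repository's code: $E_G$ is the generator's token embedding matrix, $E_D$ the discriminator's, $E_\Delta$ a residual embedding, and $\mathrm{sg}$ the stop-gradient operator.

```latex
E_D = \mathrm{sg}(E_G) + E_\Delta, \qquad E_\Delta \text{ initialized to } 0
```

Because of the stop-gradient, the discriminator's replaced-token-detection loss updates only $E_\Delta$, while $E_G$ is updated by the generator's masked-language-modeling loss alone.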

Did you modify the code to add support for GDES? I would be very interested in that implementation.

Many thanks and cheers,

Stefan

stefan-it, Jan 29 '23 10:01

Any updates on this?

musabgultekin, Feb 20 '23 21:02

Kindly pinging @DaoTranbk and @HyTruongSon.

musabgultekin, Feb 20 '23 21:02

Thanks, @stefan-it, for your interest in the v3 pretraining of DeBERTa.

In this work, we modified the DeBERTa code to add GDES in pretraining, following the DeBERTaV3 paper. If you are interested in that implementation, you can take a look at the latest v3 pretraining code in the original repository: https://github.com/microsoft/DeBERTa.

I hope this helps.

Regards, Cong Dao

DaoTranbk, Apr 24 '23 16:04