Pierre Guillou comments

Results: 11 comments of Pierre Guillou

Update the logic of misspell identification

> The current logic of misspell identification relies on vocab.txt from the transformer model. For less common words, tokenizers break them into subwords, and hence the original entire word...

Expose vocab in Python API

Hi @kcarnold. > I hacked around this by reading the `arpa` file. Can you publish your code for doing that? Thanks.
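
The workaround quoted above (recovering the vocabulary by reading the `arpa` file) could look roughly like the sketch below. This is not @kcarnold's code, which is not shown here. It assumes a standard ARPA file whose `\1-grams:` section lists one entry per line as log-probability, word, and an optional backoff weight. The function name `read_arpa_unigrams` and the sample data are hypothetical.

```python
from typing import Iterable, List


def read_arpa_unigrams(lines: Iterable[str]) -> List[str]:
    """Collect the words listed in the \\1-grams: section of an ARPA file."""
    vocab: List[str] = []
    in_unigrams = False
    for raw in lines:
        line = raw.strip()
        if line == "\\1-grams:":
            in_unigrams = True
            continue
        if not in_unigrams:
            continue
        # Any other backslash-prefixed line (\2-grams:, \end\) closes the section.
        if line.startswith("\\"):
            break
        if not line:
            continue
        # Assumed entry layout: log10_prob <sep> word [<sep> backoff]
        fields = line.split("\t") if "\t" in line else line.split()
        if len(fields) >= 2:
            vocab.append(fields[1])
    return vocab


# Tiny hand-written ARPA fragment, for illustration only.
SAMPLE_LINES = [
    "\\data\\",
    "ngram 1=3",
    "ngram 2=1",
    "",
    "\\1-grams:",
    "-1.0\t<s>\t-0.3",
    "-0.8\thello\t-0.2",
    "-1.2\t</s>",
    "",
    "\\2-grams:",
    "-0.5\t<s> hello",
    "",
    "\\end\\",
]

print(read_arpa_unigrams(SAMPLE_LINES))
```

For a real model, the same function can be given an open file handle, since it only iterates over lines.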

the method get_onnx_model() should not need the path to original model

> it should be `model_name`, not `model_name_or_path` I agree. This is exactly my point: the name of the model is important to get back our ONNX model with `get_onnx_model()`, not...

ModuleNotFoundError: No module named 'transformers.configuration_auto'

Hello @subhamkhemka. I guess you are using a recent version of transformers (4.11.3 is the current version)? Unfortunately, I think [onnx_transformers](https://github.com/patil-suraj/onnx_transformers) is no longer up to date (see this [post](https://discuss.huggingface.co/t/new-pipeline-for-zero-shot-text-classification/681/70) of...

Inference with nielsr/lilt-xlm-roberta-base

Hi @NielsRogge, Thank you, but your code assumes that I already have the corresponding bounding boxes as input, and I don't. I have only the image of a...

How to decrease inference time of LiLT?

Issue opened in the Optimum library: https://github.com/huggingface/optimum/issues/1024

Is it possible to fine-tune the pretrained model on a new dataset for text generation?

Hello, Regarding the general LM, I recommend using my 3rd model directly, as it performs better (it uses the MultiFiT configuration, whereas the first one uses...

Is it possible to fine-tune the pretrained model on a new dataset for text generation?

> Is the fine-tuning part in lm3-french-classifier-amazon.ipynb the one that includes Fine-tuning "forward LM" and Fine-tuning "backward LM"? Yes, that's right (forward and backward, to train a bidirectional LM). 1....

Is it possible to fine-tune the pretrained model on a new dataset for text generation?

There is no doubt that you need to move to BERT-type Transformers for text generation. And if BERT is the only one that has been trained in French, it is worth...

Is it possible to fine-tune the pretrained model on a new dataset for text generation?

Another tip: also look at [lm3-portuguese.ipynb](https://github.com/piegu/language-models/blob/master/lm3-portuguese.ipynb) and [lm3-portuguese-classifier-TCU-jurisprudencia.ipynb](https://github.com/piegu/language-models/blob/master/lm3-portuguese-classifier-TCU-jurisprudencia.ipynb), which use all the MultiFiT techniques, in particular Label Smoothing.