Lucile Saulnier
## Environment info
```
- `transformers` version: 4.17.0
- Platform: Linux-5.4.144+-x86_64-with-Ubuntu-18.04-bionic
- Python version: 3.7.12
- PyTorch version (GPU?): 1.10.0+cu111 (False)
- Tensorflow version (GPU?): 2.8.0 (False)
- Flax version...
```
## Describe the bug
I can't retrieve a cached dataset with offline mode enabled.

## Steps to reproduce the bug
To reproduce my issue, first, you'll need to run a...
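For context, a minimal sketch of the failing pattern, assuming a dataset such as `squad` that was cached during a previous online session (`HF_DATASETS_OFFLINE` is the documented switch for offline mode):

```python
import os

# Offline mode must be set before `datasets` is imported, since the
# library reads this variable at import time.
os.environ["HF_DATASETS_OFFLINE"] = "1"

from datasets import load_dataset

# Assumes "squad" was already downloaded and cached in an earlier,
# online session; with offline mode on, this should resolve from the
# local cache, but in the reported bug it fails to find it.
dataset = load_dataset("squad", split="train")
print(dataset[0])
```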
I think it could be useful to have a `Sequence` object for (post-)processors. Indeed, a tokenizer might need to combine the `ByteLevel` post-processor with a `TemplateProcessing` processor. Today...
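A sketch of what the proposed combination could look like, assuming a `Sequence` processor that applies its children in order; the `TemplateProcessing` arguments below are illustrative RoBERTa-style values, not taken from the issue:

```python
from tokenizers import processors

# Hypothetical usage of the proposed `Sequence`: apply ByteLevel offset
# handling first, then a template that adds <s>/</s> special tokens.
post_processor = processors.Sequence([
    processors.ByteLevel(trim_offsets=True),
    processors.TemplateProcessing(
        single="<s> $A </s>",
        pair="<s> $A </s> </s> $B </s>",
        special_tokens=[("<s>", 0), ("</s>", 2)],
    ),
])

# tokenizer.post_processor = post_processor
```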
Proposal to add the Zenodo DOI badge to the README. The DOI used here corresponds to the Concept DOI (versus the Version DOIs), which represents the concept of the software...
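For illustration, a Concept DOI badge in a README typically looks like the snippet below; the DOI here is a placeholder, not the project's actual identifier:

```markdown
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.1234567.svg)](https://doi.org/10.5281/zenodo.1234567)
```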
I seem to have seen this request more than once on `transformers`: many users would like to be able to continue training a tokenizer on a new dataset (see for...
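For reference, the closest thing `transformers` offers today trains a brand-new tokenizer while reusing the old one's pipeline configuration, rather than continuing from the existing vocabulary. A minimal sketch of that workaround (the corpus and vocabulary size are illustrative):

```python
from transformers import AutoTokenizer

old_tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Illustrative corpus; in practice this would iterate over the text
# column of the new dataset.
corpus = iter(["first document of the new corpus", "second document"])

# Trains from scratch with the same normalizer/pre-tokenizer/model
# settings as `old_tokenizer` -- it does not extend the existing
# vocabulary, which is exactly the gap this request points at.
new_tokenizer = old_tokenizer.train_new_from_iterator(corpus, vocab_size=52000)
```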
# Question
When creating a tokenizer with `add_prefix_space=True` and `trim_offsets=True` (for example with `ByteLevel` or `RobertaProcessing`), the offsets returned on a text starting with a space are not what I...
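A minimal sketch of a reproduction, assuming `roberta-base` (whose fast tokenizer uses `ByteLevel` with these two options); the expected-versus-actual offsets are cut off in the excerpt above, so none are asserted here:

```python
from transformers import RobertaTokenizerFast

tokenizer = RobertaTokenizerFast.from_pretrained(
    "roberta-base", add_prefix_space=True, trim_offsets=True
)

# Input that already starts with a space -- the case in question.
encoding = tokenizer(" Hello world", return_offsets_mapping=True)
print(encoding.tokens())
print(encoding["offset_mapping"])
```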
It might be useful to add a new argument to load the checkpoint from a specific step. Currently, the only way to do it - to the...
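A sketch of the current workaround with the `Trainer` API, where checkpoints land in `checkpoint-<step>` folders under the output directory; `checkpoint_for_step` is a hypothetical helper, not an existing API:

```python
import os

def checkpoint_for_step(output_dir: str, step: int) -> str:
    # Hypothetical helper: resolve the folder the Trainer saved for a
    # given global step (folders are named "checkpoint-<step>").
    path = os.path.join(output_dir, f"checkpoint-{step}")
    if not os.path.isdir(path):
        raise FileNotFoundError(f"no checkpoint saved at step {step}: {path}")
    return path

# With an already-configured Trainer, resuming from a specific step is
# done today by passing the folder path explicitly:
# trainer.train(resume_from_checkpoint=checkpoint_for_step("output", 500))
```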
We could, for example, parallelise what can be parallelised.
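A sketch of one way to do that, assuming the work items are independent of each other; `process_item` and the choice of executor are illustrative, not the project's actual code:

```python
from concurrent.futures import ThreadPoolExecutor

def process_item(item):
    # Stand-in for per-element work that does not depend on the other
    # elements and is therefore safe to run concurrently.
    return item * 2

items = [1, 2, 3, 4]

# Map the independent work over a thread pool; results come back in
# the same order as `items`.
with ThreadPoolExecutor() as executor:
    results = list(executor.map(process_item, items))

print(results)  # [2, 4, 6, 8]
```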
`multiple_search_dict` calls take as argument the list of elements to search for in resources. Sometimes several elements share the same fhirpath, so there is no need to compute it twice!
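A sketch of the deduplication idea: memoize each distinct fhirpath so repeated ones are evaluated only once per resource. `evaluate_fhirpath` and the element shape are hypothetical stand-ins for the project's actual objects:

```python
def evaluate_fhirpath(path, resource):
    # Stand-in for the real (expensive) fhirpath evaluation.
    return resource.get(path)

def search_all(elements, resource):
    cache = {}
    results = []
    for element in elements:
        path = element["fhirpath"]
        if path not in cache:
            # First occurrence of this fhirpath: compute and memoize,
            # so later duplicates hit the cache instead.
            cache[path] = evaluate_fhirpath(path, resource)
        results.append(cache[path])
    return results

# Two elements sharing a fhirpath -> only one evaluation.
resource = {"Patient.name": "Alice"}
elements = [{"fhirpath": "Patient.name"}, {"fhirpath": "Patient.name"}]
print(search_all(elements, resource))
```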