Artidoro Pagnoni
Undo is a central part of the development process, which involves trying things and reverting them if needed. Having a global undo makes it very difficult for two people to work on...
Hello @shashiongithub, I am also having trouble downloading the dataset. After rerunning the script more than 75 times, I still have 11 articles that cannot be downloaded. I would like to...
Checking against the original output (https://github.com/nlpyang/PreSumm), it seems like the model uses `` tokens to separate sentences in the decoded outputs. So this seems to be a problem in the...
I found a temporary solution: the double space seems to indicate the missing period.
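The workaround above can be sketched as a small post-processing step. This is only an illustration, assuming every run of two or more spaces in the decoded output marks a dropped period; `restore_periods` is a hypothetical helper name, not part of any library.

```python
import re


def restore_periods(text: str) -> str:
    """Replace each run of two or more spaces with a period and one space.

    Assumption: a double space in the decoded summary stands in for a
    missing sentence-ending period.
    """
    # Strip outer whitespace first so a trailing run does not become ". "
    return re.sub(r" {2,}", ". ", text.strip())


print(restore_periods("the cat sat  it then slept"))
```

Single spaces are left untouched, so text without the artifact passes through unchanged.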
@klausmh the main limitation, however, is that the mapping cannot be stateful. It can only map rows to new rows, with no external state for the mapping. For example, it...
I also got this when using `decapoda-research/llama-7b-hf`. With another HF conversion (a more recent one, I think) I did not get the problem. I recommend using newer conversions if possible. It looks...
Thank you @KKcorps! I also just replicated your fix and it seems to properly store the adapter checkpoints.
Hello! It's a great idea to be able to pass GenerationConfig to the Seq2SeqTrainer. However, it would be great to have a matching `GenerationArguments` class that allows parsing. Right now...
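As a rough sketch of what a parseable `GenerationArguments` class could look like, the example below uses only the standard library. The three fields and their defaults are chosen for illustration, and `parse_generation_args` is a hypothetical helper; none of this is an existing class or function in the library.

```python
import argparse
from dataclasses import dataclass, fields
from typing import Optional, Sequence


@dataclass
class GenerationArguments:
    # Illustrative fields and defaults only; a real class would likely
    # mirror whichever generation settings need to be exposed.
    max_new_tokens: int = 64
    num_beams: int = 1
    temperature: float = 1.0


def parse_generation_args(argv: Optional[Sequence[str]] = None) -> GenerationArguments:
    """Build one command-line flag per dataclass field and parse them."""
    parser = argparse.ArgumentParser()
    for f in fields(GenerationArguments):
        # f.type is the annotated type (int or float here), used as the converter
        parser.add_argument(f"--{f.name}", type=f.type, default=f.default)
    # Ignore unrelated flags so this can sit alongside other argument groups
    args, _unknown = parser.parse_known_args(argv)
    return GenerationArguments(**vars(args))


gen_args = parse_generation_args(["--num_beams", "4", "--temperature", "0.7"])
print(gen_args)
```

The parsed dataclass could then be converted into whatever configuration object the trainer expects; that conversion step is left out here because it depends on the trainer's interface.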
Ideally, huggingface does the parsing for us. We should stay away from deciding what is local and what is on the hub. Also, isn't this handled by `load_dataset`? https://huggingface.co/docs/datasets/package_reference/loading_methods#datasets.load_dataset Please...
I haven't seen that error before. I would suggest using one GPU for debugging, as this might be related to DDP. One A100 should easily fit the 7B model. https://discuss.pytorch.org/t/ddp-and-gradient-checkpointing/132244