42 comments of ArmelRandy

Hi. In `finetune.py`, you fine-tune StarCoder on a dataset containing sentences framed as `Question:\n\nAnswer:`. The loss is causal, and in this case it considers all the tokens...
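A common way to avoid training on the prompt tokens (a minimal sketch of the idea, not code from `finetune.py` — the token IDs below are placeholders) is to mask the question span with `-100`, the index PyTorch's cross-entropy loss ignores, so the causal loss only covers the answer:

```python
# Sketch: build labels for causal LM training where the loss is
# computed only on the answer tokens, not the prompt tokens.
# In a real setup the IDs would come from a tokenizer.
IGNORE_INDEX = -100  # ignored by PyTorch's cross-entropy loss

def build_labels(prompt_ids, answer_ids):
    """Prompt positions are masked out; answer positions keep their IDs."""
    return [IGNORE_INDEX] * len(prompt_ids) + list(answer_ids)

prompt_ids = [101, 102, 103]   # placeholder "Question:" tokens
answer_ids = [201, 202]        # placeholder "Answer:" tokens
input_ids = prompt_ids + answer_ids
labels = build_labels(prompt_ids, answer_ids)
assert len(labels) == len(input_ids)
```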

You can remove that argument in `Trainer`, or change where you want to report your logs, e.g. `report_to=["tensorboard"]`. In this case you will have to make sure that it is...

The fine-tuning script, i.e. `finetune.py`, is designed to fine-tune StarCoder to map an input text to an output text. If you have a dataset which follows that template (or...

Hi. It is possible to control the output of the generation by adding stop words. The generation will stop once any of the stop words is encountered. By default, the...
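The stop-word idea can be illustrated with a small sketch (hypothetical helper, not the actual implementation — real generation pipelines typically do this on token IDs via a stopping criterion rather than on decoded strings): cut the generated text at the earliest occurrence of any stop word.

```python
# Sketch: truncate generated text at the first occurrence of any stop word.
def truncate_at_stop_words(text, stop_words):
    cut = len(text)  # default: keep everything if no stop word appears
    for word in stop_words:
        idx = text.find(word)
        if idx != -1:
            cut = min(cut, idx)  # keep the earliest stop position
    return text[:cut]

generated = "def f():\n    return 1\n\nQuestion:"
print(truncate_at_stop_words(generated, ["Question:", "<|end|>"]))
```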

Hi. Thank you for pointing that out. The issue should be resolved now (just pull the changes locally). It was because the argument `prompt_type` is not supported by the parser, so...

Hi. I think it depends on the logging frequency. Try reducing the `logging_steps` parameter and tell me if that solves your issue.

You should try to update transformers (>= 4.31.0.dev0), accelerate (>= 0.21.0.dev0) and bitsandbytes. And instead of loading in 8-bit, try to use

```python
model = AutoModelForCausalLM.from_pretrained(
    "bigcode/starcoder",
    device_map="auto",
    load_in_4bit=True,
)
```

You may want to read this [blogpost](https://huggingface.co/blog/accelerate-large-models) to understand how to run large models with the help of [accelerate](https://github.com/huggingface/accelerate). You can load the model with the following code:

```python
import torch...
```
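For reference, loading a large checkpoint across the available devices typically looks like the following (a sketch assuming the `transformers`/`accelerate` APIs described in the blogpost; it requires downloading the StarCoder weights, so it is not runnable as a standalone snippet):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# device_map="auto" lets accelerate shard the weights across the
# available GPUs (spilling to CPU if needed); float16 halves memory use.
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    device_map="auto",
    torch_dtype=torch.float16,
)
```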

Hi, I am not able to reproduce the error on my side. I tried with a `single_prompt.txt` that I created myself, but it probably has significantly fewer tokens than yours...

You can look at the [hardware requirements](https://github.com/bigcode-project/starcoder#inference-hardware-requirements) for StarCoder. Try loading the model in 8-bit with the code provided there.