Jack Shi Wei Lun
Hi, I am trying to quantize my custom fine-tuned deepseek-7b instruct model, and I am unable to do so. I followed the document: ``` # Convert to fp16 fp16 =...
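The snippet above is truncated, so the exact document being followed is unknown. As a minimal sketch of the fp16 conversion step, assuming a standard transformers workflow, something like the following could be used to produce an fp16 checkpoint that a separate quantizer (GGUF/GPTQ/AWQ) can then consume; the model id and output path are placeholders.

```python
# Sketch (assumption): load the fine-tuned checkpoint in fp16 and save it back
# out before the actual quantization step. Paths below are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "my-deepseek-7b-instruct-ft"  # placeholder for the fine-tuned checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

# Save the fp16 weights; a downstream quantization tool can read this folder.
model.save_pretrained("deepseek-7b-instruct-fp16")
tokenizer.save_pretrained("deepseek-7b-instruct-fp16")
```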
### Prerequisites - [X] I have read the [documentation](https://hf.co/docs/autotrain). - [X] I have checked other issues for similar problems. ### Backend Local ### Interface Used CLI ### CLI Command --model_max_length...
As per title, ``` optimizer: adamw_torch ``` How can I tweak the beta1 and beta2 values in my .yml file? @abhishekkrthakur wondering if this is possible. thanks!
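Whether the AutoTrain YAML exposes these keys is exactly the open question here, but as a hedged sketch: if the run is driven by transformers' Trainer under the hood, the `adamw_torch` betas correspond to `adam_beta1` / `adam_beta2` on `TrainingArguments`.

```python
# Sketch (assumption): the transformers-level equivalent of tweaking the
# AdamW betas. Values are illustrative, not a recommendation.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    optim="adamw_torch",
    adam_beta1=0.9,   # default 0.9
    adam_beta2=0.95,  # default 0.999; 0.95 is a common LLM setting
)
```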
Hi all, Thanks for the wonderful work. I am currently running code_bert_score to evaluate the similarity between generated code and 'correct' code. However, it just takes way too long locally....
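One thing that usually helps is making sure scoring runs on GPU with a larger batch size. A minimal sketch, assuming `code_bert_score.score` accepts bert_score-style `device` and `batch_size` arguments (check the signature in your installed version; the argument names here are assumptions):

```python
# Sketch (assumption): push scoring onto the GPU and batch the pairs.
# The device/batch_size kwargs follow bert_score conventions and may differ
# in code_bert_score; verify against your installed version.
import torch
import code_bert_score

predictions = ["def add(a, b):\n    return a + b"]
references = ["def add(x, y):\n    return x + y"]

device = "cuda" if torch.cuda.is_available() else "cpu"

precision, recall, f1, f3 = code_bert_score.score(
    cands=predictions,
    refs=references,
    lang="python",
    device=device,    # assumed kwarg, forwarded to the underlying scorer
    batch_size=64,    # assumed kwarg; raise it if GPU memory allows
)
print(f1.mean())
```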
### Prerequisites - [X] I have read the [documentation](https://hf.co/docs/autotrain). - [X] I have checked other issues for similar problems. ### Backend Local ### Interface Used CLI ### CLI Command _No...
### Prerequisites - [X] I have read the [documentation](https://hf.co/docs/autotrain). - [X] I have checked other issues for similar problems. ### Backend Local ### Interface Used CLI ### CLI Command -...
The configs gave ``` target_modules: all-linear ``` Is this equivalent to the code below? ``` target_modules: "q_proj,v_proj,o_proj,k_proj,gate_proj,down_proj,up_proj" ``` Will there be any differences between the two options? **Is that the...
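For reference, here is how the two settings would look as PEFT `LoraConfig` objects. With `"all-linear"`, PEFT discovers every `nn.Linear` module except the output head, which for Llama-style architectures typically resolves to the same seven projections listed explicitly; the `r`/`lora_alpha` values below are illustrative.

```python
# Sketch: the two target_modules settings expressed as peft LoraConfig objects.
from peft import LoraConfig

cfg_all_linear = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules="all-linear",  # peft targets every Linear layer except the output head
)

cfg_explicit = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```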
### Prerequisites - [X] I have read the [documentation](https://hf.co/docs/autotrain). - [X] I have checked other issues for similar problems. ### Backend Local ### Interface Used CLI ### CLI Command ```...
As mentioned in the title, for GRPO, changing QLoRA to LoRA didn't affect VRAM. When I change num_gen from 4 to 8, it did not affect VRAM either. When I change...
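For context, these are the TRL `GRPOConfig` fields that usually dominate VRAM; whether `num_generations` changes memory depends on how the completions are batched, and completion length often matters more. Values below are illustrative only.

```python
# Sketch (assumption): GRPOConfig knobs that typically drive memory usage.
from trl import GRPOConfig

config = GRPOConfig(
    output_dir="grpo-out",
    per_device_train_batch_size=4,   # prompts per device per step
    num_generations=8,               # completions sampled per prompt
    max_prompt_length=512,
    max_completion_length=256,       # long completions often cost more than extra generations
    gradient_accumulation_steps=4,
)
```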
For GRPO, the VRAM usage increases throughout the steps. Initially I am at 36 GB VRAM (maybe for the first 10 steps), but it always crashes after 600-700 steps when the...
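To pin down where the growth happens, a minimal sketch of a logging callback (assuming the GRPO trainer accepts standard transformers callbacks via `trainer.add_callback(...)`; the class name is a placeholder):

```python
# Sketch: log CUDA memory every N steps to see whether the growth is in
# allocated tensors or just the caching allocator's reserved pool.
import torch
from transformers import TrainerCallback

class MemoryLogger(TrainerCallback):
    def __init__(self, every_n_steps=50):
        self.every_n_steps = every_n_steps

    def on_step_end(self, args, state, control, **kwargs):
        if torch.cuda.is_available() and state.global_step % self.every_n_steps == 0:
            alloc = torch.cuda.memory_allocated() / 1e9
            reserved = torch.cuda.memory_reserved() / 1e9
            print(f"step {state.global_step}: allocated={alloc:.1f} GB, reserved={reserved:.1f} GB")
```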