Jack Shi Wei Lun
Hi, I am trying to quantize my custom fine-tuned deepseek-7b instruct model, and I am unable to do so. I followed the document: ``` # Convert to fp16 fp16 =...
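The snippet above is truncated, so the exact document being followed is unknown. As a minimal sketch of the fp16 conversion step, assuming a standard transformers workflow, something like the following could be used to produce an fp16 checkpoint that a separate quantizer (GGUF/GPTQ/AWQ) can then consume; the model id and output path are placeholders.

```python
# Sketch (assumption): load the fine-tuned checkpoint in fp16 and save it back
# out before the actual quantization step. Paths below are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "my-deepseek-7b-instruct-ft"  # placeholder for the fine-tuned checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

# Save the fp16 weights; a downstream quantization tool can read this folder.
model.save_pretrained("deepseek-7b-instruct-fp16")
tokenizer.save_pretrained("deepseek-7b-instruct-fp16")
```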
### Prerequisites - [X] I have read the [documentation](https://hf.co/docs/autotrain). - [X] I have checked other issues for similar problems. ### Backend Local ### Interface Used CLI ### CLI Command --model_max_length...
As per title, ``` optimizer: adamw_torch ``` How can I tweak the beta1 and beta2 values in my .yml file? @abhishekkrthakur wondering if this is possible. thanks!
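Whether the AutoTrain YAML exposes these keys is exactly the open question here, but as a hedged sketch: if the run is driven by transformers' Trainer under the hood, the `adamw_torch` betas correspond to `adam_beta1` / `adam_beta2` on `TrainingArguments`.

```python
# Sketch (assumption): the transformers-level equivalent of tweaking the
# AdamW betas. Values are illustrative, not a recommendation.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    optim="adamw_torch",
    adam_beta1=0.9,   # default 0.9
    adam_beta2=0.95,  # default 0.999; 0.95 is a common LLM setting
)
```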
Hi all, Thanks for the wonderful work. I am currently running code_bert_score to evaluate the similarity between generated code and 'correct' code. However, it just takes way too long locally....
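One thing that usually helps is making sure scoring runs on GPU with a larger batch size. A minimal sketch, assuming `code_bert_score.score` accepts bert_score-style `device` and `batch_size` arguments (check the signature in your installed version; the argument names here are assumptions):

```python
# Sketch (assumption): push scoring onto the GPU and batch the pairs.
# The device/batch_size kwargs follow bert_score conventions and may differ
# in code_bert_score; verify against your installed version.
import torch
import code_bert_score

predictions = ["def add(a, b):\n    return a + b"]
references = ["def add(x, y):\n    return x + y"]

device = "cuda" if torch.cuda.is_available() else "cpu"

precision, recall, f1, f3 = code_bert_score.score(
    cands=predictions,
    refs=references,
    lang="python",
    device=device,    # assumed kwarg, forwarded to the underlying scorer
    batch_size=64,    # assumed kwarg; raise it if GPU memory allows
)
print(f1.mean())
```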
### Prerequisites - [X] I have read the [documentation](https://hf.co/docs/autotrain). - [X] I have checked other issues for similar problems. ### Backend Local ### Interface Used CLI ### CLI Command _No...
### Prerequisites - [X] I have read the [documentation](https://hf.co/docs/autotrain). - [X] I have checked other issues for similar problems. ### Backend Local ### Interface Used CLI ### CLI Command -...
The configs gave ``` target_modules: all-linear ``` Is this equivalent to the code below? ``` target_modules: "q_proj,v_proj,o_proj,k_proj,gate_proj,down_proj,up_proj" ``` Will there be any differences between the two options? **Is that the...
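For reference, here is how the two settings would look as PEFT `LoraConfig` objects. With `"all-linear"`, PEFT discovers every `nn.Linear` module except the output head, which for Llama-style architectures typically resolves to the same seven projections listed explicitly; the `r`/`lora_alpha` values below are illustrative.

```python
# Sketch: the two target_modules settings expressed as peft LoraConfig objects.
from peft import LoraConfig

cfg_all_linear = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules="all-linear",  # peft targets every Linear layer except the output head
)

cfg_explicit = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```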
### Prerequisites - [X] I have read the [documentation](https://hf.co/docs/autotrain). - [X] I have checked other issues for similar problems. ### Backend Local ### Interface Used CLI ### CLI Command ```...
As mentioned in the title, for GRPO, changing QLoRA to LoRA didn't affect VRAM. When I change num_gen from 4 to 8, it did not affect VRAM either. When I change...
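For context, these are the TRL `GRPOConfig` fields that usually dominate VRAM; whether `num_generations` changes memory depends on how the completions are batched, and completion length often matters more. Values below are illustrative only.

```python
# Sketch (assumption): GRPOConfig knobs that typically drive memory usage.
from trl import GRPOConfig

config = GRPOConfig(
    output_dir="grpo-out",
    per_device_train_batch_size=4,   # prompts per device per step
    num_generations=8,               # completions sampled per prompt
    max_prompt_length=512,
    max_completion_length=256,       # long completions often cost more than extra generations
    gradient_accumulation_steps=4,
)
```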
For GRPO, the VRAM usage increases throughout the steps. Initially I am at 36 GB VRAM (maybe for the first 10 steps), but it always crashes after 600-700 steps when the...
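To pin down where the growth happens, a minimal sketch of a logging callback (assuming the GRPO trainer accepts standard transformers callbacks via `trainer.add_callback(...)`; the class name is a placeholder):

```python
# Sketch: log CUDA memory every N steps to see whether the growth is in
# allocated tensors or just the caching allocator's reserved pool.
import torch
from transformers import TrainerCallback

class MemoryLogger(TrainerCallback):
    def __init__(self, every_n_steps=50):
        self.every_n_steps = every_n_steps

    def on_step_end(self, args, state, control, **kwargs):
        if torch.cuda.is_available() and state.global_step % self.every_n_steps == 0:
            alloc = torch.cuda.memory_allocated() / 1e9
            reserved = torch.cuda.memory_reserved() / 1e9
            print(f"step {state.global_step}: allocated={alloc:.1f} GB, reserved={reserved:.1f} GB")
```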