Lu Mingcong
It's because the Poisson distribution has an infinite upper limit, so the author simply truncates it at 10 cars. What you said about executing 0-20 is reflected in this...
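A minimal sketch of that truncation (the rate `lam` and the variable names are my assumptions; only the cap of 10 comes from the discussion above):

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 3.0   # assumed Poisson rate (cars per day); not specified in the thread
cap = 10    # the author's truncation limit

# Raw Poisson samples have infinite support: any count 0, 1, 2, ... is possible.
raw = rng.poisson(lam, size=100_000)

# Truncate by clipping everything above the cap down to the cap itself.
capped = np.minimum(raw, cap)

# The probability mass beyond the cap is tiny, which is why the cut-off is safe.
print("P(X > 10) ~", (raw > cap).mean())
```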
What's your torch version? I used torch 2.0 at first and got the same problem, then I downgraded it to 1.13.1 and it works well. Hope this helps.
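If it is the same issue, the downgrade is just (assuming a pip-managed environment; pick the wheel matching your CUDA version):

```
pip install torch==1.13.1
```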
> Hello, I was just trying this out as well. Using the latest `peft` as suggested gets rid of the "cannot flatten integer dtype tensors" error. However, a new error now...
> Tried these
>
> ```
> pip uninstall peft
> pip install git+https://github.com/huggingface/peft.git
> ```
>
> and am getting the same error:
>
> ```
> [rank1]: File "...../LLaMA-Factory/v/lib/python3.11/site-packages/peft/tuners/lora/dora.py", ...
> ```
Same problem in LLaMA-Factory (latest pull, 2025-04-29): https://github.com/open-thoughts/open-thoughts/issues/30
There is indeed no `zero_pp_rank_4_mp_rank_00_optim_states.pt` in `.../checkpoint-1500/global_step1499/`. What can I do?

```
$ ls checkpoint-1500/global_step1499/
bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt
bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt
bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt
bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt
bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt
bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt
bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt
bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt
bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt
bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt
bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt
bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt
bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt
bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt
bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt
bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt
...
```
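For what it's worth, a quick sketch to see which rank shards are actually present or missing under either naming scheme (the `world_size` of 32 is a guess from the rank numbers above; adjust it and the path to your run):

```python
import os

ckpt_dir = "checkpoint-1500/global_step1499"  # path from the ls output above
world_size = 32                               # assumption; set to your data-parallel world size

# Check both the bf16-prefixed shards (which the listing shows) and the
# unprefixed "zero_pp_rank_*" names (which the resume code appears to look for).
for prefix in ("bf16_zero", "zero"):
    missing = [
        f"{prefix}_pp_rank_{r}_mp_rank_00_optim_states.pt"
        for r in range(world_size)
        if not os.path.exists(
            os.path.join(ckpt_dir, f"{prefix}_pp_rank_{r}_mp_rank_00_optim_states.pt")
        )
    ]
    print(prefix, "missing:", missing or "none")
```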
> try --enable_liger_kernel and --use_unsloth_gc

`--use_unsloth_gc` or `--use_unsloth`?
> use_unsloth_gc

Thanks. BTW, I have encountered an error: `Triton Error [CUDA]: device kernel image is invalid` when using `--enable_liger_kernel`. Here is some package info: triton==3.1.0, transformers==4.44.2, torch==2.3.0, CUDA SDK...
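In case it helps narrow this down, a small diagnostic sketch that just prints the pieces which have to agree for Triton kernels to load ("device kernel image is invalid" usually points at a mismatch between the compiled kernels and the CUDA driver/GPU architecture, which is my guess here, not something confirmed in the thread):

```python
import torch
import triton

print("torch:", torch.__version__)
print("triton:", triton.__version__)
print("torch built for CUDA:", torch.version.cuda)
print("GPU visible:", torch.cuda.is_available())
print("GPU compute capability:", torch.cuda.get_device_capability(0))
```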
May I ask how much hardware you used for the 72B model with 16k-long context? Would 8x 80GB GPUs be enough?