DenceChen
@sainttelant Could you please tell me what flags you used when training? My loss is around 2.0.
@hellochick Can you share your training args? Your README does not include any training instructions.
keras_contrib's CRF does not support masking.
I have the same problem
This framework needs a lot of GPU memory, so if you optimize that part it will be a perfect framework.
Could everyone share their training scripts? Mine stops working as soon as I switch to the a3b model, and this is on 8x A800 GPUs with 80 GB of memory each. Frustrating.
```
accelerate launch \
    --main_process_port 25515 \
    --config_file ./scripts/config.yaml \
    ./src/train.py \
    --stage sft \
    --do_train True \
    --model_name_or_path ${model_path} \
    --dataset $train_ds \
    --dataset_dir /opt/nas/p/learning_platform/zouyapeng/docsum/LLaMA-Factory/data \
    --template qwen3 \
    ...
```
tokenizer_config.json -> chat_template -> {%- if enable_thinking is not defined or enable_thinking is false %}
```
151648: AddedToken("", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
151649: AddedToken("", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True)
```
Could the cause be that `special` is set to True? I see that models which natively ship with this tag have `special` set to False.
@Alice1998 @qubingxin @katouHui I solved it: don't add them as special tokens, add them as normal tokens instead. /root/dence/gendata/LLaMA-Factory/src/llamafactory/model/patcher.py
```
def patch_tokenizer(tokenizer: "PreTrainedTokenizer", model_args: "ModelArguments") -> None:
    if "PreTrainedTokenizerBase" not in str(tokenizer._pad.__func__):
        tokenizer._pad = MethodType(PreTrainedTokenizerBase._pad, tokenizer)

    if model_args.model_max_length is not None and...
```
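For anyone following along, a minimal sketch of the distinction being described, using the standard Hugging Face `transformers` tokenizer API. The token strings `<new_tag>`/`</new_tag>` and the model path are placeholders, not the actual tokens or paths from the comment above; this only illustrates "add as normal tokens (special=False) rather than special tokens".
```python
# Sketch only: add new tags as *normal* tokens so they are not registered
# with special=True in the saved tokenizer config. Placeholder names throughout.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path/to/base_model")   # placeholder path
model = AutoModelForCausalLM.from_pretrained("path/to/base_model")

# Normal tokens (special=False); "<new_tag>"/"</new_tag>" are hypothetical tags.
num_added = tokenizer.add_tokens(["<new_tag>", "</new_tag>"], special_tokens=False)

# The alternative the comment advises against would be:
#   tokenizer.add_special_tokens({"additional_special_tokens": ["<new_tag>", "</new_tag>"]})
# which registers the tokens with special=True.

if num_added > 0:
    # Grow the embedding matrix so the new token ids have embeddings to train.
    model.resize_token_embeddings(len(tokenizer))
```
Whether normal or special tokens are appropriate depends on how the chat template and decoding handle the tags; the comment above only reports that switching from special to normal tokens resolved the issue in that setup.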