LiMa-cas
Thanks in advance. Hi, I have a question about the model: in your script I see you use the FP16 model instead of the INT4 model. Could QLoRA use the quantized...
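For context, a minimal QLoRA-style loading sketch, assuming the usual transformers + peft + bitsandbytes stack (model name and LoRA hyperparameters are illustrative). QLoRA typically starts from FP16 weights and quantizes them to 4-bit NF4 on load, which may be why the script uses the FP16 checkpoint rather than a pre-quantized INT4 one:

```python
# Sketch: QLoRA loads FP16 weights and quantizes them on the fly.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize the FP16 weights on load
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",  # FP16 checkpoint (illustrative)
    quantization_config=bnb_config,
    device_map="auto",
)
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora)
```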
Hi, which transformers version is required? I used the latest, 4.46.2, but got this error: `TypeError: LlamaRotaryEmbedding.forward() got an unexpected keyword argument 'seq_len'`. Maybe it needs an older version of...
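A quick way to check whether an installed transformers version still exposes the old `seq_len` argument (a sketch; newer releases removed it, and the exact version cutoff is not confirmed here):

```python
# Check whether this transformers version still accepts `seq_len`
# in LlamaRotaryEmbedding.forward (the argument was removed in newer releases).
import inspect
import transformers
from transformers.models.llama.modeling_llama import LlamaRotaryEmbedding

print(transformers.__version__)
print("seq_len" in inspect.signature(LlamaRotaryEmbedding.forward).parameters)
```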
Hi, but this is what I used for fine-tuning: `NCCL_P2P_DISABLE=1 NCCL_IB_DISABLE=1 CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node 4 --nnodes 1 --node_rank 0 --master_addr localhost --master_port 6666 ../finetune_llama3.py --model_name_or_path "/extra_data/mali36/GAOTONG/AWQMODEL/llama3-8B-instruct-awq" --data_path "../data/Belle_sampled_qwen.json" --bf16 True --output_dir "../output/llama3_8B_instruct_awq_qlora" --num_train_epochs 100 --per_device_train_batch_size 1 --per_device_eval_batch_size 1...
Hi, I removed the --load_in_4bit flag above, and then ran into a new problem:
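For reference, a minimal sketch of loading an already-quantized AWQ checkpoint and attaching LoRA adapters; since the weights are pre-quantized, passing the bitsandbytes --load_in_4bit option on top of them would conflict. The path is the one from the command above; LoRA hyperparameters are illustrative, and loading AWQ checkpoints assumes the autoawq package is installed:

```python
# Sketch: attach LoRA to an already-quantized AWQ checkpoint.
# No load_in_4bit here: the weights are pre-quantized.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_path = "/extra_data/mali36/GAOTONG/AWQMODEL/llama3-8B-instruct-awq"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```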
Thanks very much. I might not have expressed myself clearly last time. 1. Why is block fine-tuning worse than global fine-tuning? 2. The inference time, rather than the quantization time, is much...
Thanks a lot!
Hi, when I use PV-tuning for an AWQ model, first I need to ... but I encountered a bug: ... Could you help me?
Hi, the problem was solved by using an 80GB GPU.
Hi, another question: if I want to quantize to 4 bits, are the following parameters right (--scale_nbits=4)? But during the process, current_avg_bits is 2.8: python main.py $MODEL_PATH $DATASET_PATH \ --nsamples=1024 \...
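For what it's worth, a back-of-envelope calculator for the average bits per weight in AQLM-style quantization. The dominant term is the codes, so raising --scale_nbits alone will not get the width to 4 bits; num_codebooks, nbits_per_codebook, and the group sizes matter more. The exact accounting of scale and codebook overhead below is an assumption inferred from the command-line parameters, not the repository's own formula:

```python
# Back-of-envelope average bits per weight for AQLM-style quantization.
# Assumption: each group of (in_group_size * out_group_size) weights stores
# num_codebooks codes of nbits_per_codebook bits; scale storage is approximate.
def avg_bits_per_weight(num_codebooks, nbits_per_codebook,
                        in_group_size, out_group_size=1, scale_nbits=0):
    weights_per_group = in_group_size * out_group_size
    code_bits = num_codebooks * nbits_per_codebook / weights_per_group
    scale_bits = scale_nbits / weights_per_group  # assumed: one scale per group
    return code_bits + scale_bits

# e.g. 1 codebook x 16 bits over groups of 8 weights ~= 2 bits/weight;
# adding 4-bit group scales only nudges it upward, not to 4 bits.
print(avg_bits_per_weight(1, 16, 8, scale_nbits=4))  # -> 2.5
```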
Hi, thanks a lot. There are two scripts called "finetune.py". By "global fine-tuning", do you mean the one called "finetune.py" as follows? But main.py above uses the "finetune.py" under src:...