dshwei

Results: 3 comments by dshwei

Reducing the --gpu-memory-utilization value may also resolve this problem.
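For example, a lower reservation can be passed on the serve command line (the model path and the 0.7 value below are illustrative placeholders, not from the original issue):

```shell
# Lower --gpu-memory-utilization so vLLM reserves a smaller fraction of
# GPU memory for its KV cache and weights up front (default is 0.9).
vllm serve /path/to/model --gpu-memory-utilization 0.7
```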

When you want to use PEFT checkpoints to continue fine-tuning, PeftModel.from_pretrained's parameter is_trainable=True is required.
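A minimal sketch of what this looks like; the base model name and adapter path are placeholder assumptions, not from the original comment:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Placeholder base model and adapter checkpoint paths.
base = AutoModelForCausalLM.from_pretrained("base-model-name")

# is_trainable=True keeps the adapter weights trainable; without it,
# PeftModel.from_pretrained loads the adapter in inference mode (frozen),
# so a subsequent fine-tuning run would not update the adapter at all.
model = PeftModel.from_pretrained(
    base, "path/to/peft-checkpoint", is_trainable=True
)
model.train()
```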

vllm 0.7.1, torch 2.5.1. When using this vLLM version, set VLLM_TORCH_PROFILER_DIR=./traces/ and run the command as follows: VLLM_TORCH_PROFILER_DIR=./traces/ vllm serve /workspace/models/DeepSeek-V2-Lite-Chat --gpu-memory-utilization 0.80 --max-model-len 8000 --max-num-batched-tokens 32000...
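Once the server is up with VLLM_TORCH_PROFILER_DIR set, traces are captured by bracketing requests with vLLM's profiling endpoints. A sketch, assuming the server listens on localhost:8000 and the request body values are placeholders:

```shell
# Start a torch profiler capture on the running vLLM server.
curl -X POST http://localhost:8000/start_profile

# Send a request to be profiled (model name and prompt are placeholders).
curl -s http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "DeepSeek-V2-Lite-Chat", "prompt": "Hello", "max_tokens": 16}'

# Stop the capture; the trace is written under VLLM_TORCH_PROFILER_DIR.
curl -X POST http://localhost:8000/stop_profile
```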