
[Bug] Parameter mismatch during vLLM inference

Open c-box opened this issue 1 year ago • 3 comments

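Prerequisites

  • [x] I have searched Issues and Discussions but did not get the expected help.
  • [x] The bug has not been fixed in the latest version.

Issue type

I am evaluating with an officially supported task/model/dataset.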

Environment

{'CUDA available': True,
 'CUDA_HOME': '/usr/local/cuda',
 'GCC': 'gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0',
 'GPU 0,1,2,3,4,5,6,7': 'NVIDIA H20',
 'MMEngine': '0.10.7',
 'MUSA available': False,
 'NVCC': 'Cuda compilation tools, release 12.1, V12.1.105',
 'OpenCV': '4.11.0',
 'PyTorch': '2.5.1+cu124',
 'PyTorch compiling details': 'PyTorch built with:\n'
                              '  - GCC 9.3\n'
                              '  - C++ Version: 201703\n'
                              '  - Intel(R) oneAPI Math Kernel Library Version '
                              '2024.2-Product Build 20240605 for Intel(R) 64 '
                              'architecture applications\n'
                              '  - Intel(R) MKL-DNN v3.5.3 (Git Hash '
                              '66f0cb9eb66affd2da3bf5f8d897376f04aae6af)\n'
                              '  - OpenMP 201511 (a.k.a. OpenMP 4.5)\n'
                              '  - LAPACK is enabled (usually provided by '
                              'MKL)\n'
                              '  - NNPACK is enabled\n'
                              '  - CPU capability usage: AVX512\n'
                              '  - CUDA Runtime 12.4\n'
                              '  - NVCC architecture flags: '
                              '-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90\n'
                              '  - CuDNN 90.1\n'
                              '  - Magma 2.6.1\n'
                              '  - Build settings: BLAS_INFO=mkl, '
                              'BUILD_TYPE=Release, CUDA_VERSION=12.4, '
                              'CUDNN_VERSION=9.1.0, '
                              'CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, '
                              'CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 '
                              '-fabi-version=11 -fvisibility-inlines-hidden '
                              '-DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO '
                              '-DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON '
                              '-DUSE_FBGEMM -DUSE_PYTORCH_QNNPACK '
                              '-DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE '
                              '-O2 -fPIC -Wall -Wextra -Werror=return-type '
                              '-Werror=non-virtual-dtor -Werror=bool-operation '
                              '-Wnarrowing -Wno-missing-field-initializers '
                              '-Wno-type-limits -Wno-array-bounds '
                              '-Wno-unknown-pragmas -Wno-unused-parameter '
                              '-Wno-strict-overflow -Wno-strict-aliasing '
                              '-Wno-stringop-overflow -Wsuggest-override '
                              '-Wno-psabi -Wno-error=old-style-cast '
                              '-Wno-missing-braces -fdiagnostics-color=always '
                              '-faligned-new -Wno-unused-but-set-variable '
                              '-Wno-maybe-uninitialized -fno-math-errno '
                              '-fno-trapping-math -Werror=format '
                              '-Wno-stringop-overflow, LAPACK_INFO=mkl, '
                              'PERF_WITH_AVX=1, PERF_WITH_AVX2=1, '
                              'TORCH_VERSION=2.5.1, USE_CUDA=ON, USE_CUDNN=ON, '
                              'USE_CUSPARSELT=1, USE_EXCEPTION_PTR=1, '
                              'USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, '
                              'USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, '
                              'USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, '
                              'USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, \n',
 'Python': '3.10.16 (main, Dec 11 2024, 16:24:50) [GCC 11.2.0]',
 'TorchVision': '0.20.1+cu124',
 'lmdeploy': "not installed:No module named 'lmdeploy'",
 'numpy_random_seed': 2147483648,
 'opencompass': '0.4.1+',
 'sys.platform': 'linux',
 'transformers': '4.49.0'}

Reproducing the problem - code/configuration example

from mmengine.config import read_base

with read_base():
    from opencompass.configs.datasets.livecodebench.livecodebench_o1_gen_f0ed6c import LCB_datasets
    from opencompass.configs.models.deepseek.hf_deepseek_r1_distill_qwen_1_5b import models as hf_deepseek_r1_distill_qwen_1_5b_models
    
    

datasets = [*LCB_datasets]

models = sum([v for k, v in locals().items() if k.endswith('_models')], [])

The hf_deepseek_r1_distill_qwen_1_5b config was modified as follows:

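from opencompass.models import HuggingFacewithChatTemplate
from opencompass.utils.text_postprocessors import extract_non_reasoning_content

max_len = 32768

models = [
    dict(
        type=HuggingFacewithChatTemplate,
        abbr='deepseek-r1-distill-qwen-1.5b-hf',
        path='hf_models/DeepSeek-R1-Distill-Qwen-1.5B',
        max_seq_len=max_len + 1024,
        max_out_len=max_len,
        # Sampling settings that are expected to reach the inference backend
        gen_config=dict(
            do_sample=True,
            temperature=0.6,
            top_p=0.95,
            max_new_tokens=max_len),
        batch_size=64,
        run_cfg=dict(num_gpus=1),
        pred_postprocessor=dict(type=extract_non_reasoning_content),
    )
]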

Reproducing the problem - command or script

opencompass custom_example/eval_custom.py -a vllm

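Reproducing the problem - error message

The vLLM parameters used during inference do not appear to match the gen_config in the model configuration. The following shows up in the log, and almost all of the values are defaults: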

SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0, top_p=1.0, top_k=-1, min_p=0.0, seed=None, stop=['<|end▁of▁sentence|>'], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=8192, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None)

The config also drops the relevant parameters:

hf_deepseek_r1_distill_qwen_1_5b_models=[
    dict(abbr='deepseek-r1-distill-qwen-1.5b-hf',
        batch_size=64,
        gen_config=dict(
            do_sample=True,
            max_new_tokens=32768,
            temperature=0.6,
            top_p=0.95),
        max_out_len=32768,
        max_seq_len=32768,
        path='hf_models/DeepSeek-R1-Distill-Qwen-1.5B',
        pred_postprocessor=dict(
            type='opencompass.utils.text_postprocessors.extract_non_reasoning_content'),
        run_cfg=dict(
            num_gpus=1),
        type='opencompass.models.HuggingFacewithChatTemplate'),
    ]
models=[
    dict(abbr='deepseek-r1-distill-qwen-1.5b-vllm',
        batch_size=16,
        max_out_len=32768,
        max_seq_len=32768,
        model_kwargs=dict(
            max_model_len=32768,
            tensor_parallel_size=1),
        path='hf_models/DeepSeek-R1-Distill-Qwen-1.5B',
        run_cfg=dict(
            num_gpus=1),
        stop_words=[
            ],
        type='opencompass.models.vllm_with_tf_above_v4_33.VLLMwithChatTemplate'),
    ]
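
To make the mismatch concrete, here is a minimal, library-free sketch that translates the HF-style gen_config into vLLM-style sampling arguments and compares them with the values from the SamplingParams line in the log above. The helper `hf_gen_config_to_vllm_kwargs` and the table `HF_TO_VLLM_KEYS` are hypothetical names introduced only for illustration; they are not part of opencompass or vLLM, and this is not a description of how opencompass performs the conversion.

```python
from typing import Any, Dict

# Hypothetical table (not from opencompass): HF generate() argument names
# mapped to the vLLM SamplingParams field that carries the same meaning.
HF_TO_VLLM_KEYS = {
    "temperature": "temperature",
    "top_p": "top_p",
    "max_new_tokens": "max_tokens",
}


def hf_gen_config_to_vllm_kwargs(gen_config: Dict[str, Any]) -> Dict[str, Any]:
    """Translate an HF-style gen_config into vLLM-style sampling kwargs."""
    kwargs = {
        vllm_key: gen_config[hf_key]
        for hf_key, vllm_key in HF_TO_VLLM_KEYS.items()
        if hf_key in gen_config
    }
    # vLLM has no do_sample flag; greedy decoding is expressed as
    # temperature=0. HF decodes greedily unless do_sample is set.
    if not gen_config.get("do_sample", False):
        kwargs["temperature"] = 0.0
    return kwargs


# gen_config from the HF model config in this issue.
gen_config = dict(do_sample=True, temperature=0.6, top_p=0.95,
                  max_new_tokens=32768)
expected = hf_gen_config_to_vllm_kwargs(gen_config)

# Values copied from the SamplingParams line in the log.
logged = dict(temperature=0, top_p=1.0, max_tokens=8192)

# Map each differing field to (expected value, logged value).
mismatched = {k: (v, logged[k]) for k, v in expected.items()
              if logged.get(k) != v}
print(mismatched)
```

All three fields differ, which is consistent with gen_config not being carried over when the model is converted with -a vllm.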

Other information

No response

c-box avatar Mar 12 '25 08:03 c-box

When your model type is an HF one, such as HuggingFacewithChatTemplate, do not use gen_config to pass the model parameters; use model_kwargs instead.

liushz avatar Mar 21 '25 11:03 liushz

Have you solved it?

SefaZeng avatar Apr 22 '25 11:04 SefaZeng

I encountered the same issue. You can refer to my solution at https://github.com/open-compass/opencompass/issues/2027#issuecomment-3435966223

Aatrox103 avatar Oct 29 '25 09:10 Aatrox103