Running bash recipe/dapo/run_dapo_qwen3_8b_base_npu.sh fails with the following error:
Traceback (most recent call last):
  File "/tmp/ray/session_2025-11-03_18-41-07_640704_1388/runtime_resources/working_dir_files/_ray_pkg_7965894561b2304f/recipe/dapo/main_dapo.py", line 33, in main
    run_ppo(config)
  File "/tmp/ray/session_2025-11-03_18-41-07_640704_1388/runtime_resources/working_dir_files/_ray_pkg_7965894561b2304f/recipe/dapo/main_dapo.py", line 62, in run_ppo
    ray.get(runner.run.remote(config))
  File "/root/miniconda3/lib/python3.12/site-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
    return fn(*args, **kwargs)
  File "/root/miniconda3/lib/python3.12/site-packages/ray/_private/client_mode_hook.py", line 104, in wrapper
    return func(*args, **kwargs)
  File "/root/miniconda3/lib/python3.12/site-packages/ray/_private/worker.py", line 2961, in get
    values, debugger_breakpoint = worker.get_objects(
  File "/root/miniconda3/lib/python3.12/site-packages/ray/_private/worker.py", line 1026, in get_objects
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError: ray::TaskRunner.run() (pid=12607, ip=172.17.0.16, actor_id=d8360131cf10cb61773b4a2702000000, repr=<main_dapo.TaskRunner object at 0x7fcdb6271a60>)
  File "/tmp/ray/session_2025-11-03_18-41-07_640704_1388/runtime_resources/working_dir_files/_ray_pkg_7965894561b2304f/recipe/dapo/main_dapo.py", line 178, in run
    trainer.init_workers()
  File "/tmp/ray/session_2025-11-03_18-41-07_640704_1388/runtime_resources/working_dir_files/_ray_pkg_7965894561b2304f/verl/trainer/ppo/ray_trainer.py", line 766, in init_workers
    self.actor_rollout_wg.init_model()
  File "/tmp/ray/session_2025-11-03_18-41-07_640704_1388/runtime_resources/working_dir_files/_ray_pkg_7965894561b2304f/verl/single_controller/ray/base.py", line 48, in __call__
    output = ray.get(output)
ray.exceptions.RayTaskError: ray::WorkerDict.actor_rollout_init_model() (pid=20330, ip=172.17.0.16, actor_id=90a777fadb0625899b426f5202000000, repr=<verl.single_controller.ray.base.WorkerDict object at 0x7eccafe08620>)
  File "/tmp/ray/session_2025-11-03_18-41-07_640704_1388/runtime_resources/working_dir_files/_ray_pkg_7965894561b2304f/verl/single_controller/ray/base.py", line 700, in func
    return getattr(self.worker_dict[key], name)(*args, **kwargs)
  File "/tmp/ray/session_2025-11-03_18-41-07_640704_1388/runtime_resources/working_dir_files/_ray_pkg_7965894561b2304f/verl/single_controller/base/decorator.py", line 442, in inner
    return func(*args, **kwargs)
  File "/tmp/ray/session_2025-11-03_18-41-07_640704_1388/runtime_resources/working_dir_files/_ray_pkg_7965894561b2304f/verl/utils/transferqueue_utils.py", line 199, in dummy_inner
    return func(*args, **kwargs)
  File "/tmp/ray/session_2025-11-03_18-41-07_640704_1388/runtime_resources/working_dir_files/_ray_pkg_7965894561b2304f/verl/workers/fsdp_workers.py", line 810, in init_model
    self._build_rollout(trust_remote_code=self.config.model.get("trust_remote_code", False))
  File "/tmp/ray/session_2025-11-03_18-41-07_640704_1388/runtime_resources/working_dir_files/_ray_pkg_7965894561b2304f/verl/workers/fsdp_workers.py", line 618, in _build_rollout
    self.rollout = get_rollout_class(rollout_config.name, rollout_config.mode)(
  File "/tmp/ray/session_2025-11-03_18-41-07_640704_1388/runtime_resources/working_dir_files/_ray_pkg_7965894561b2304f/verl/workers/rollout/vllm_rollout/vllm_rollout_spmd.py", line 219, in __init__
    self.inference_engine = LLM(
  File "/root/miniconda3/lib/python3.12/site-packages/vllm/entrypoints/llm.py", line 297, in __init__
    self.llm_engine = LLMEngine.from_engine_args(
  File "/root/miniconda3/lib/python3.12/site-packages/vllm/v1/engine/llm_engine.py", line 169, in from_engine_args
    vllm_config = engine_args.create_engine_config(usage_context)
  File "/root/miniconda3/lib/python3.12/site-packages/vllm/engine/arg_utils.py", line 1142, in create_engine_config
    model_config = self.create_model_config()
  File "/root/miniconda3/lib/python3.12/site-packages/vllm/engine/arg_utils.py", line 994, in create_model_config
    return ModelConfig(
  File "/root/miniconda3/lib/python3.12/site-packages/pydantic/_internal/_dataclasses.py", line 121, in __init__
    s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
pydantic_core._pydantic_core.ValidationError: 1 validation error for ModelConfig
  Value error, Model architectures ['Qwen3ForCausalLM'] failed to be inspected. Please check the logs for more details. [type=value_error, input_value=ArgsKwargs((), {'model': ...rocessor_plugin': None}), input_type=ArgsKwargs]
    For further information visit https://errors.pydantic.dev/2.12/v/value_error

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

Job 'raysubmit_hqxe53knDFBqJzb8' failed
Status message: Job entrypoint command failed with exit code 1, last available logs (truncated to 20,000 chars):
  File "/root/miniconda3/lib/python3.12/site-packages/vllm/engine/arg_utils.py", line 994, in create_model_config
    return ModelConfig(
  File "/root/miniconda3/lib/python3.12/site-packages/pydantic/_internal/_dataclasses.py", line 121, in __init__
    s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
pydantic_core._pydantic_core.ValidationError: 1 validation error for ModelConfig
  Value error, Model architectures ['Qwen3ForCausalLM'] failed to be inspected. Please check the logs for more details. [type=value_error, input_value=ArgsKwargs((), {'model': ...rocessor_plugin': None}), input_type=ArgsKwargs]
    For further information visit https://errors.pydantic.dev/2.12/v/value_error

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Information
- [x] The official example scripts
- [ ] My own modified scripts
Tasks
- [x] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
The script is identical to recipe/dapo/run_dapo_qwen3_8b_base_npu.sh.
Expected behavior
How can this error be resolved?
Please provide detailed environment information.
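One possible way to gather that information is a short script that prints the installed versions of the packages appearing in the traceback. This is only a sketch: the package list below is an assumption (in particular, `vllm-ascend` and `torch-npu` are guesses based on the NPU script name, and `verl` may be run from source rather than installed), so adjust it to match the actual environment.

```python
from importlib import metadata
import platform

# Assumed package list: names taken from the traceback, plus guesses
# ("vllm-ascend", "torch-npu") for an NPU setup. Edit as needed.
PACKAGES = [
    "verl",
    "vllm",
    "vllm-ascend",
    "torch",
    "torch-npu",
    "transformers",
    "ray",
    "pydantic",
]


def collect_versions(names):
    """Return a dict mapping each distribution name to its installed version."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed under this name.
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    print(f"python: {platform.python_version()}")
    for name, version in collect_versions(PACKAGES).items():
        print(f"{name}: {version}")
```

Pasting this output into the issue, together with the hardware and driver details, should make it easier to tell whether the installed vllm build can load Qwen3ForCausalLM on this machine.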