Congmin(Xavier) Qiu issues

Results: 8 issues of Congmin(Xavier) Qiu

Why does shared_one_hot initialize [..., 0] to 1?

Line 25 of this file: https://github.com/tristandeleu/ntm-one-shot/blob/master/mann/utils/init.py. Is there a special reason to initialize the read weights like this?

[tool] fix: add tool name validation before execution

## Summary
- Add early validation of tool names in `_call_tool()` method
- Return informative error message listing available tools when model calls non-existent tool
- Prevents KeyError and provides...
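The validation described in this summary can be illustrated with a minimal sketch. It is not the code from the PR: the `call_tool` function, the `tools` registry, and the exact error wording are hypothetical and only show the general idea of checking a tool name before dispatching.

```python
from typing import Any, Callable, Dict


def call_tool(tools: Dict[str, Callable[..., Any]], name: str, **kwargs: Any) -> str:
    """Hypothetical dispatcher: validate the tool name before executing it."""
    if name not in tools:
        # Return a readable message instead of letting a KeyError propagate.
        available = ", ".join(sorted(tools)) or "(none)"
        return f"Error: unknown tool '{name}'. Available tools: {available}"
    return str(tools[name](**kwargs))


# Illustrative registry with a single tool (an assumption, not from the PR).
tools = {"add": lambda a, b: a + b}

ok_result = call_tool(tools, "add", a=1, b=2)
bad_result = call_tool(tools, "mul", a=1, b=2)
```

Returning the message as a string, rather than raising, lets the caller pass it back to the model so it can retry with a valid name.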

[trainer] feat: make max_colocate_count configurable in ResourcePoolManager

## Summary
✅ **Verified on Google Colab**
- Add `max_colocate_count` field to `ResourcePoolManager` dataclass
- Add Ray version check using `packaging` library (requires >= 2.39.0 for max_colocate_count > 1)
- ...

[vllm, rollout] fix: lazy initialize ZMQ to avoid event loop error

## 🐛 Problem
When running Online DPO or SPIN training, the program crashes during initialization with:
```python
RuntimeError: no running event loop at verl/workers/rollout/vllm_rollout/vllm_rollout_spmd.py:575
```
### Root Cause
The `vLLMAsyncRollout.__init__()`...
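The general lazy-initialization pattern named in the title can be sketched with the standard library alone. This is an assumption-laden illustration, not the PR's code: the `LazyLoopResource` class is hypothetical, and it binds to the running event loop as a stand-in for any loop-dependent setup such as a ZMQ socket.

```python
import asyncio


class LazyLoopResource:
    """Hypothetical stand-in for an object that needs a running event loop."""

    def __init__(self) -> None:
        # Nothing loop-bound is created here, so construction is safe
        # even when no event loop is running yet.
        self._loop = None

    def _ensure_initialized(self) -> asyncio.AbstractEventLoop:
        # Deferred setup: runs on first use, inside a coroutine.
        if self._loop is None:
            self._loop = asyncio.get_running_loop()
        return self._loop

    async def use(self) -> bool:
        loop = self._ensure_initialized()
        return loop.is_running()


# Eager access outside a loop fails with the error quoted above.
try:
    asyncio.get_running_loop()
    eager_failed = False
except RuntimeError:
    eager_failed = True

worker = LazyLoopResource()          # constructed with no loop running
result = asyncio.run(worker.use())   # initialized lazily inside the loop
```

The design choice is to move loop-dependent work out of `__init__()` and into the first call that is guaranteed to run inside a coroutine.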

[agent_loop] fix: ensure weight sync regardless of free_cache_engine

## 🐛 Problem
When `free_cache_engine=False`, weight synchronization between actor and rollout is completely skipped, causing:
- Rollout model weights never update after first epoch
- Extreme off-policy training (rollout uses...

[sglang, rollout] fix: use right padding for response_position_ids

Fixes #4159
- Changed response_position_ids padding from left to right
- Ensures alignment with response_ids for variable-length sequences
- Critical for 2D position_ids in multimodal models (e.g., Qwen2-VL)
- Added...
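The alignment issue can be illustrated with plain Python lists. This is only a sketch under assumed names (`pad_right`, `pad_left`, and the sample values are hypothetical); the real code operates on tensors.

```python
from typing import List, Sequence


def pad_right(seq: Sequence[int], length: int, pad_value: int = 0) -> List[int]:
    """Append padding so real tokens stay at the start of the row."""
    if len(seq) > length:
        raise ValueError("sequence is longer than the target length")
    return list(seq) + [pad_value] * (length - len(seq))


def pad_left(seq: Sequence[int], length: int, pad_value: int = 0) -> List[int]:
    """Prepend padding so real tokens end up at the end of the row."""
    if len(seq) > length:
        raise ValueError("sequence is longer than the target length")
    return [pad_value] * (length - len(seq)) + list(seq)


# Illustrative values: three response tokens and their position ids.
response_ids = pad_right([101, 102, 103], 5)
aligned_positions = pad_right([7, 8, 9], 5)
misaligned_positions = pad_left([7, 8, 9], 5)

# With matching right padding, index i in both rows refers to the same token.
aligned_pairs = list(zip(response_ids, aligned_positions))
```

When the ids are right-padded but the positions are left-padded, the first real token is paired with a padding position, which is the kind of mismatch the fix targets.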

[trainer] fix: use unscaled logits for accurate pearson correlation metric

## Summary
Fixes #4162

This PR ensures the `rollout_actor_probs_pearson_corr` metric accurately reflects the correlation between rollout and actor probabilities by computing log probabilities from unscaled logits for metrics calculation.

##...
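For readers unfamiliar with the two ingredients, here is a small standard-library sketch of a log-softmax with an optional temperature and a Pearson correlation. The function names and the `temperature` parameter are assumptions for illustration and are not taken from the PR; `temperature=1.0` corresponds to unscaled logits.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable log-softmax; temperature=1.0 leaves logits unscaled."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_sum_exp for x in scaled]


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Illustrative logits only; not data from the PR.
logits = [2.0, 0.5, -1.0]
unscaled_probs = [math.exp(v) for v in log_softmax(logits)]
scaled_probs = [math.exp(v) for v in log_softmax(logits, temperature=0.5)]
corr = pearson(unscaled_probs, scaled_probs)
```

Temperature scaling changes the shape of the distribution, so probabilities computed with and without it are not interchangeable inputs to a correlation metric.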

[Draft, Don't review][fsdp_workers] fix: Skip FSDP loading in async rollout mode to save GPU memory

## Summary
Fixes #4229

This PR optimizes GPU memory usage in async rollout mode by skipping unnecessary FSDP model loading.

**Memory Savings**: ~50% for rollout workers (e.g., 14GB vs 22GB...