OmniVerifier
Generative Universal Verifier as Multimodal Meta-Reasoner
Introduction
We introduce Generative Universal Verifier, a novel concept and plugin designed for next-generation multimodal reasoning in vision-language models and unified multimodal models, providing the fundamental capability of reflection and refinement on visual outcomes during the reasoning and generation process.
- ViVerBench: a comprehensive benchmark spanning 16 categories of critical tasks for evaluating visual outcomes in multimodal reasoning.
- OmniVerifier-7B: the first omni-capable generative verifier, trained on large-scale visual verification data for universal visual verification; it achieves notable gains on ViVerBench (+8.3).
- OmniVerifier-TTS: a sequential test-time scaling paradigm that leverages the universal verifier to bridge image generation and editing within unified models, raising the upper bound of generative ability through iterative fine-grained optimization.
OmniVerifier advances both reliable reflection during generation and scalable test-time refinement, marking a step toward more trustworthy and controllable next-generation reasoning systems.
New Updates
[2025.11] Inference code of two automated pipelines for visual verifier data construction is released.
[2025.10] Inference code of Sequential OmniVerifier-TTS (based on Qwen-Image) is released.
[2025.10] Evaluation code of ViVerBench is released.
[2025.10] Training code of OmniVerifier is released.
TODO
- [x] Two automated data construction pipelines
- [ ] Sequential OmniVerifier-TTS on different backbones
- [ ] Parallel OmniVerifier-TTS
Installation
git clone https://github.com/Cominclip/OmniVerifier.git
cd OmniVerifier
pip install -e .
Quick Start: Generated Image Verification
Use the following command to test OmniVerifier-7B on a generated image:
python inference.py
Please modify image_path and prompt in the script to your own settings.
The model will output both an answer and an explanation indicating whether the image is strictly aligned with the given prompt.
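Since the verifier returns both an answer and an explanation, a small parsing helper can make the output easier to consume downstream. The exact output format of inference.py is not documented here, so the "Answer:" / "Explanation:" labels below are an assumption; adjust the patterns to match what the script actually prints.

```python
import re

def parse_verifier_output(text: str):
    """Split a verifier response into (answer, explanation).

    Assumes a hypothetical format such as:
        "Answer: Yes\nExplanation: ..."
    Adapt the regexes to the actual format emitted by inference.py.
    """
    answer_match = re.search(r"Answer:\s*(\w+)", text)
    expl_match = re.search(r"Explanation:\s*(.*)", text, re.DOTALL)
    answer = answer_match.group(1) if answer_match else None
    explanation = expl_match.group(1).strip() if expl_match else None
    return answer, explanation

sample = "Answer: Yes\nExplanation: Every object in the prompt appears in the image."
ans, expl = parse_verifier_output(sample)
```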
Part1: ViVerBench Evaluation
We provide two evaluation approaches: rule-based and model-based. As a first step, store the model outputs in a JSON file such as your_model.json.
For rule-based evaluation:
python viverbench_eval_rule_based.py --model_response your_model.json
For model-based evaluation, we use GPT-4.1 as the judge model:
python viverbench_eval_model_based.py --model_response your_model.json
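To illustrate the first step (collecting model outputs into your_model.json), here is a minimal sketch of writing such a file. The field names below (id, category, response) are assumptions, not the documented schema; check the evaluation scripts for the exact keys they expect.

```python
import json

# Hypothetical schema: the actual keys required by
# viverbench_eval_rule_based.py may differ -- inspect the script
# before building your response file.
responses = [
    {"id": "viver_0001", "category": "object_counting", "response": "Yes"},
    {"id": "viver_0002", "category": "spatial_relation", "response": "No"},
]

with open("your_model.json", "w") as f:
    json.dump(responses, f, indent=2)
```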
Part2: OmniVerifier RL Training
We apply DAPO to train Qwen2.5-VL-7B directly, without a cold start:
bash examples/qwen2_5_vl_7b_dapo.sh
After training, merge the checkpoint into Hugging Face format:
python3 scripts/model_merger.py --local_dir checkpoints/omniverifier/exp_name/global_step_1/actor
Part3: OmniVerifier-TTS
We provide the code for sequential OmniVerifier-TTS using Qwen-Image. First generate the step-0 image, then use this script to iteratively self-refine:
python sequential_omniverifier_tts.py
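The generate-verify-refine loop behind sequential test-time scaling can be sketched as follows. This is a structural sketch only: `generate`, `verify`, and `refine` are placeholders for the actual Qwen-Image and OmniVerifier-7B calls made inside sequential_omniverifier_tts.py, and `max_steps` is an assumed budget parameter.

```python
def sequential_tts(prompt, generate, verify, refine, max_steps=4):
    """Sequential test-time scaling sketch: produce a step-0 image, then
    repeatedly ask the verifier whether the image matches the prompt and,
    if not, apply a targeted edit guided by the verifier's feedback."""
    image = generate(prompt)  # step-0 image
    for step in range(max_steps):
        ok, feedback = verify(prompt, image)
        if ok:
            break  # verifier accepts: stop refining
        image = refine(prompt, image, feedback)  # fine-grained edit
    return image

# Toy stand-ins to show the control flow: an "image" is an int
# that the refiner increments until the verifier accepts it at 3.
result = sequential_tts(
    "a red cube",
    generate=lambda p: 0,
    verify=lambda p, img: (img >= 3, "add missing detail"),
    refine=lambda p, img, fb: img + 1,
)
```

The key design choice is that refinement is driven by the verifier's explanation rather than blind resampling, which is what lets a unified model improve a single sample iteratively instead of generating many candidates in parallel.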
Citation
@article{zhang2025generative,
title={Generative Universal Verifier as Multimodal Meta-Reasoner},
author={Zhang, Xinchen and Zhang, Xiaoying and Wu, Youbin and Cao, Yanbin and Zhang, Renrui and Chu, Ruihang and Yang, Ling and Yang, Yujiu},
journal={arXiv preprint arXiv:2510.13804},
year={2025}
}
Acknowledgements
OmniVerifier is built upon several solid works. Thanks to EasyR1 and veRL for their wonderful codebases!