reward-modeling topic
tasksource
Dataset collection and preprocessing framework for NLP extreme multitask learning
DMoERM
[ACL 2024 Findings] DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling
IterComp
[ICLR 2025] IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
vector-inference
Efficient LLM inference on Slurm clusters using vLLM.
RewardModelingBeyondBradleyTerry
Official implementation of the ICLR 2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and Alternatives
Science-T2I
[CVPR 2025] Science-T2I: Addressing Scientific Illusions in Image Synthesis
learning-from-rewards-llm-papers
A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward models and learning strategies across training, inference, and post-...
qa_metrics
An easy-to-use Python package for running quick, basic QA evaluations. This package includes standardized QA evaluation metrics and semantic evaluation metrics: Black-box and Open-Source large language model promp...
hybrid-preferences
Learning to route instances for Human vs AI Feedback (ACL Main '25)
LongRM
Revealing and unlocking the context boundary of reward models