Unified Personalized Reward Model for Vision Generation
Yibin Wang (1,2), Yuhang Zang (4), Feng Han (1,2), Yujie Zhou (3,4), Jiazi Bu (3,4), Cheng Jin (1,2), Jiaqi Wang (2)
(1) Fudan University, (2) Shanghai Innovation Institute, (3) Shanghai Jiaotong University, (4) Shanghai AI Lab
Paper
UnifiedReward-Flex
Pref-GRPO
🤗 Checkpoints
🤗 Dataset
Video Comparison
2 guys talking near a big tree, animation style
Wan2.1-T2V-14B
GRPO w/UnifiedReward-Flex
Alien couple performing a massive concert in a violet cyberpunk world, vibrant, psychdellic 4k, 1080p
Wan2.1-T2V-14B
GRPO w/UnifiedReward-Flex
An Iron man is playing the electronic guitar, high electronic guitar
Wan2.1-T2V-14B
GRPO w/UnifiedReward-Flex
Origami dancers in white paper, 3D render, on white background, studio shot, dancing modern dance
Wan2.1-T2V-14B
GRPO w/UnifiedReward-Flex
Robot dancing in Times Square
Wan2.1-T2V-14B
GRPO w/UnifiedReward-Flex
all AI models fighting in mortal kombat
Wan2.1-T2V-14B
GRPO w/UnifiedReward-Flex
human girl talk to cute dragon, pixar, disney
Wan2.1-T2V-14B
GRPO w/UnifiedReward-Flex
unicorn running in the beautiful garden with rainbow
Wan2.1-T2V-14B
GRPO w/UnifiedReward-Flex