Wenhu Chen
5 papers · Latest:
Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling
This paper proposes a new five-level taxonomy for visual generation, shifting from appearance synthesis to intelligent, agentic world modeling.
Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation
Tuna-2 is a unified multimodal model using pixel embeddings for understanding and generation, outperforming vision encoders and simplifying architecture.
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time
RationalRewards uses explicit, multi-dimensional critiques to improve visual generation at both training and test time, outperforming scalar rewards.
ClawBench: Can AI Agents Complete Everyday Online Tasks?
ClawBench introduces a real-world benchmark of 153 online tasks across 144 live platforms, revealing current AI agents struggle with everyday web automation.
Explanations from Large Language Models Make Small Reasoners Better
This paper shows how explanations generated by large language models can be used to train smaller, more efficient models that achieve superior reasoning accuracy and generate high-quality explanations.