Weiming Ren
2 papers · Latest:
Computer Vision
Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation
Tuna-2 is a unified multimodal model using pixel embeddings for understanding and generation, outperforming vision encoders and simplifying architecture.
2604.24763
Artificial Intelligence
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time
RationalRewards uses explicit, multi-dimensional critiques to improve visual generation at both training and test time, outperforming scalar rewards.
2604.11626