Wayne Xin Zhao

6 papers · Latest: April 29, 2026

ClawGym: A Scalable Framework for Building Effective Claw Agents

ClawGym introduces a scalable framework for developing Claw-style agents, including a synthetic dataset, trained models, and an evaluation benchmark.

2604.26904 · Apr 29, 2026

Computer Vision

Improving Vision-language Models with Perception-centric Process Reward Models

Perceval is a new process reward model that improves vision-language models by providing token-level supervision to identify and correct perceptual errors.

2604.24583 · Apr 27, 2026

Natural Language Processing

ArbGraph: Conflict-Aware Evidence Arbitration for Reliable Long-Form Retrieval-Augmented Generation

ArbGraph improves long-form RAG reliability through pre-generation evidence arbitration, which resolves factual conflicts before text is generated.

2604.18362 · Apr 20, 2026

Natural Language Processing

Toward Autonomous Long-Horizon Engineering for ML Research

AiScientist is a new system for autonomous long-horizon ML research engineering, using hierarchical orchestration and a File-as-Bus workspace for durable state continuity.

2604.13018 · Apr 14, 2026

Computer Vision

Towards Long-horizon Agentic Multimodal Search

LMM-Searcher enables long-horizon multimodal search by offloading visual data to files and using UIDs, and it achieves SOTA performance.

2604.12890 · Apr 14, 2026

InCoder-32B-Thinking: Industrial Code World Model for Thinking

InCoder-32B-Thinking generates expert reasoning traces for industrial code by combining error-driven chain-of-thought with a hardware-aware world model.

2604.03144 · Apr 3, 2026
