Wayne Xin Zhao
6 papers · Latest:
ClawGym: A Scalable Framework for Building Effective Claw Agents
ClawGym introduces a scalable framework for developing Claw-style agents, including a synthetic dataset, trained models, and an evaluation benchmark.
Improving Vision-language Models with Perception-centric Process Reward Models
Perceval is a new process reward model that improves vision-language models by providing token-level supervision to identify and correct perceptual errors.
ArbGraph: Conflict-Aware Evidence Arbitration for Reliable Long-Form Retrieval-Augmented Generation
ArbGraph improves long-form RAG reliability through pre-generation evidence arbitration, resolving factual conflicts before text generation.
Toward Autonomous Long-Horizon Engineering for ML Research
AiScientist is a new system for autonomous long-horizon ML research engineering, using hierarchical orchestration and a File-as-Bus workspace for durable state continuity.
Towards Long-horizon Agentic Multimodal Search
LMM-Searcher enables long-horizon multimodal search by offloading visual data to files and using UIDs, and it achieves SOTA performance.
InCoder-32B-Thinking: Industrial Code World Model for Thinking
InCoder-32B-Thinking generates expert reasoning traces for industrial code by combining error-driven chain-of-thought with a hardware-aware world model.