Weinan Zhang

4 papers · Latest: May 13, 2026

SWE-Cycle: Benchmarking Code Agents across the Complete Issue Resolution Cycle

SWE-Cycle introduces a new benchmark and SWE-Judge evaluation system to accurately assess autonomous code agents across the complete software issue resolution cycle.

2605.13139 · May 13, 2026

Artificial Intelligence

PRTS: A Primitive Reasoning and Tasking System via Contrastive Representations

PRTS is a VLA model that uses contrastive Goal-Conditioned RL to learn goal-reachability, significantly improving robot task execution and long-horizon planning.

2604.27472 · Apr 30, 2026

Information Retrieval

Modular Representation Compression: Adapting LLMs for Efficient and Effective Recommendations

MARC compresses LLM representations for recommendation systems by addressing mid-layer advantage, improving efficiency and effectiveness.

2604.18146 · Apr 20, 2026

Software Engineering

Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering

LLM agents increasingly externalize capabilities like memory, skills, and protocols into surrounding infrastructure, transforming how they solve complex tasks.

2604.08224 · Apr 9, 2026
