Weinan Zhang
4 papers · Latest:
SWE-Cycle: Benchmarking Code Agents across the Complete Issue Resolution Cycle
SWE-Cycle introduces a new benchmark and SWE-Judge evaluation system to accurately assess autonomous code agents across the complete software issue resolution cycle.
PRTS: A Primitive Reasoning and Tasking System via Contrastive Representations
PRTS is a VLA model that uses contrastive Goal-Conditioned RL to learn goal-reachability, significantly improving robot task execution and long-horizon planning.
Modular Representation Compression: Adapting LLMs for Efficient and Effective Recommendations
MARC compresses LLM representations for recommendation systems by addressing mid-layer advantage, improving efficiency and effectiveness.
Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering
LLM agents increasingly externalize capabilities like memory, skills, and protocols into surrounding infrastructure, transforming how they solve complex tasks.