Chenxin Li
2 papers · Latest:
Natural Language Processing
Towards On-Policy Data Evolution for Visual-Native Multimodal Deep Search Agents
ODE enhances multimodal deep search agents via an image bank for reusable visual evidence and on-policy data evolution, improving performance significantly.
2605.10832
Software Engineering
Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows
Claw-Eval-Live is a live benchmark for LLM agents, evaluating their performance on evolving real-world workflows with verifiable execution.
2604.28139