Nanyun Peng
2 papers · Latest:
Natural Language Processing
LongMemEval-V2: Evaluating Long-Term Agent Memory Toward Experienced Colleagues
LongMemEval-V2 introduces a new benchmark to evaluate long-term agent memory for acquiring environment-specific experience in web environments.
2605.12493
Computer Vision
OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks
OpenVLThinkerV2 introduces Gaussian GRPO and task-level shaping to create a robust multimodal reasoning model, outperforming existing models.
2604.08539