Xuebo Liu
3 papers · Latest:
Natural Language Processing
CoCoReviewBench: A Completeness- and Correctness-Oriented Benchmark for AI Reviewers
CoCoReviewBench is a new benchmark for AI reviewers, focusing on completeness and correctness by curating 3,900 papers with expert annotations.
2605.07905
Artificial Intelligence
MASPO: Joint Prompt Optimization for LLM-based Multi-Agent Systems
MASPO optimizes prompts for LLM multi-agent systems by jointly evaluating their impact on successor agents, improving collaborative task performance.
2605.06623
Artificial Intelligence
OGER: A Robust Offline-Guided Exploration Reward for Hybrid Reinforcement Learning
OGER is a new framework that enhances LLM exploration in RLVR by unifying offline guidance and online RL with an entropy-aware reward.
2604.18530