Hanghang Tong
3 papers · Latest:
Natural Language Processing
RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards
RubricEM is a meta-RL framework that uses rubrics to guide policy decomposition and reflection for training research agents without verifiable rewards.
2605.10899
Artificial Intelligence
Recursive Multi-Agent Systems
RecursiveMAS scales multi-agent collaboration by casting the system as a unified latent-space recursive computation, improving performance and efficiency.
2604.25917
Information Retrieval
PAPERMIND: Benchmarking Agentic Reasoning and Critique over Scientific Papers in Multimodal LLMs
PAPERMIND is a new benchmark evaluating multimodal LLMs' integrated reasoning and critique over scientific papers across diverse domains.
2604.21304