Hanghang Tong

3 papers · Latest: May 11, 2026

RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards

RubricEM is a meta-RL framework that uses rubrics to guide policy decomposition and reflection for training research agents without verifiable rewards.

2605.10899 · May 11, 2026

Artificial Intelligence

Recursive Multi-Agent Systems

RecursiveMAS scales multi-agent collaboration by casting the system as a unified latent-space recursive computation, improving performance and efficiency.

2604.25917 · Apr 28, 2026

Information Retrieval

PAPERMIND: Benchmarking Agentic Reasoning and Critique over Scientific Papers in Multimodal LLMs

PAPERMIND is a new benchmark evaluating multimodal LLMs' integrated reasoning and critique over scientific papers across diverse domains.

2604.21304 · Apr 23, 2026
