Yanfei Chen

2 papers · Latest: May 11, 2026

RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards

RubricEM is a meta-RL framework that uses rubrics to guide policy decomposition and reflection for training research agents without verifiable rewards.

2605.10899 · May 11, 2026

Artificial Intelligence

SkillOS: Learning Skill Curation for Self-Evolving Agents

SkillOS enables LLM agents to self-evolve by learning to curate reusable skills from experience, outperforming baselines on a variety of tasks.

2605.06614 · May 7, 2026
