Yanfei Chen
2 papers · Latest:
Natural Language Processing
RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards
RubricEM is a meta-RL framework that uses rubrics to guide policy decomposition and reflection for training research agents without verifiable rewards.
2605.10899
Artificial Intelligence
SkillOS: Learning Skill Curation for Self-Evolving Agents
SkillOS enables LLM agents to self-evolve by learning to curate reusable skills from experience, outperforming baselines in various tasks.
2605.06614