Jun Yan

4 papers · Latest: May 11, 2026

RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards

RubricEM is a meta-RL framework that uses rubrics to guide policy decomposition and reflection for training research agents without verifiable rewards.

2605.10899 · May 11, 2026

Artificial Intelligence

SkillOS: Learning Skill Curation for Self-Evolving Agents

SkillOS enables LLM agents to self-evolve by learning to curate reusable skills from experience, outperforming baselines across a variety of tasks.

2605.06614 · May 7, 2026

Natural Language Processing

Generating Effective CoT Traces for Mitigating Causal Hallucination

This paper introduces a pipeline to generate Chain-of-Thought traces and a new metric (CHR) to mitigate causal hallucination in smaller LLMs.

2604.12748 · Apr 14, 2026

Populations & Evolution

Synonymous Codon Usage Bias Overrides Phylogeny to Reflect Convergent Frond Architecture in a Rapidly Radiating Fern Family Thelypteridaceae

Ferns show that synonymous codon usage bias (CUB) can override phylogeny, reflecting convergent frond architecture driven by specific photosynthesis genes.

2604.03028 · Apr 3, 2026
