Jun Yan
4 papers · Latest:
Natural Language Processing
RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards
RubricEM is a meta-RL framework that uses rubrics to guide policy decomposition and reflection for training research agents without verifiable rewards.
2605.10899
Artificial Intelligence
SkillOS: Learning Skill Curation for Self-Evolving Agents
SkillOS enables LLM agents to self-evolve by learning to curate reusable skills from experience, outperforming baselines in various tasks.
2605.06614
Natural Language Processing
Generating Effective CoT Traces for Mitigating Causal Hallucination
This paper introduces a pipeline to generate Chain-of-Thought traces and a new metric (CHR) to mitigate causal hallucination in smaller LLMs.
2604.12748
Populations & Evolution
Synonymous Codon Usage Bias Overrides Phylogeny to Reflect Convergent Frond Architecture in a Rapidly Radiating Fern Family Thelypteridaceae
In the fern family Thelypteridaceae, synonymous codon usage bias (CUB) can override phylogeny, reflecting convergent frond architecture driven by specific photosynthesis genes.
2604.03028