Rui Li

2 papers · Latest: May 12, 2026

StepCodeReasoner: Aligning Code Reasoning with Stepwise Execution Traces via Reinforcement Learning

StepCodeReasoner uses reinforcement learning to align code reasoning with stepwise execution traces, achieving state-of-the-art performance by supervising intermediate states.

2605.11922 · May 12, 2026

Natural Language Processing

Stochasticity in Tokenisation Improves Robustness

Stochastic tokenization during pre-training and fine-tuning significantly improves large language model robustness against adversarial and random perturbations.

2604.16037 · Apr 17, 2026
