Egor Bogomolov

2 papers · Latest: May 11, 2026

Step Rejection Fine-Tuning: A Practical Distillation Recipe

Step Rejection Fine-Tuning (SRFT) improves LLM agent training by leveraging partially correct, unresolved trajectories, outperforming standard RFT.

This paper introduces Source-Attributed BPE (SA-BPE) to regularize code tokenizers, reducing under-trained tokens caused by data imbalance.
