Saadia Gabriel

2 papers · Latest: April 20, 2026

When Can LLMs Learn to Reason with Weak Supervision?

LLMs generalize under weak supervision when reward saturation is slow and reasoning is faithful, with SFT on traces being crucial.

2604.18574 · Apr 20, 2026

Artificial Intelligence

SUPERNOVA: Eliciting General Reasoning in LLMs with Reinforcement Learning on Natural Instructions

SUPERNOVA is a data curation framework that uses RL with natural instructions to significantly improve LLM general reasoning by adapting existing instruction-tuning datasets.

2604.08477 · Apr 9, 2026
