Hamid Palangi

2 papers · Latest: April 20, 2026

When Can LLMs Learn to Reason with Weak Supervision?

LLMs generalize under weak supervision when reward saturation is slow and reasoning is faithful; supervised fine-tuning (SFT) on traces is crucial.

2604.18574 · Apr 20, 2026

Natural Language Processing

Orca: Progressive Learning from Complex Explanation Traces of GPT-4

Orca is a 13B parameter model that improves small model reasoning by progressively learning from GPT-4's complex explanation traces and step-by-step thought processes, achieving state-of-the-art zero-shot performance on challenging benchmarks.

2306.02707 · Jun 5, 2023
