Alex Ray

2 papers · Latest: March 4, 2022

Training language models to follow instructions with human feedback

This paper presents InstructGPT, a method to align language models with user intent by fine-tuning GPT-3 using human feedback, resulting in more truthful, helpful, and less toxic outputs.

2203.02155Mar 4, 2022

Machine Learning

Evaluating Large Language Models Trained on Code

Codex, a GPT model fine-tuned on GitHub code, significantly outperforms prior models in generating correct Python programs from docstrings, demonstrating strong code synthesis capabilities.

2107.03374Jul 7, 2021

📬 Weekly AI Paper Digest

Get the top 10 AI/ML arXiv papers from the week — summarized, scored, and delivered to your inbox every Monday.