Noam Shazeer
3 papers · Latest:
PaLM: Scaling Language Modeling with Pathways
PaLM is a 540-billion parameter Transformer language model that achieves state-of-the-art few-shot learning performance across diverse benchmarks, demonstrating significant benefits from scaling.
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
This paper introduces a unified text-to-text framework for transfer learning in NLP, achieving state-of-the-art results across diverse language tasks by systematically exploring pre-training and fine-tuning strategies.
Attention Is All You Need
The paper introduces the Transformer, a novel neural network architecture based solely on attention mechanisms that outperforms traditional recurrent and convolutional models in sequence transduction tasks like machine translation.