Sharan Narang
5 papers · Latest:
The Llama 3 Herd of Models
Llama 3 is a new family of large multilingual foundation models excelling in language, coding, reasoning, and multimodal tasks, rivaling GPT-4 in quality and offering extensive public releases.
Llama 2: Open Foundation and Fine-Tuned Chat Models
Llama 2 introduces a range of open-source large language models, including fine-tuned chat models that outperform existing open-source alternatives in benchmarks and human evaluations.
PaLM: Scaling Language Modeling with Pathways
PaLM is a 540-billion parameter Transformer language model that achieves state-of-the-art few-shot learning performance across diverse benchmarks, demonstrating significant benefits from scaling.
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Self-consistency is a new decoding strategy that improves chain-of-thought reasoning in language models by sampling diverse reasoning paths and selecting the most consistent answer.
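The core idea can be sketched in a few lines: sample several reasoning paths from the model at nonzero temperature, keep only each path's final answer, and return the answer that appears most often. A minimal sketch, assuming a caller-supplied `sample_fn` that stands in for one sampled chain-of-thought from a language model (the sampler shown in the usage example is a hypothetical stub, not a real model call):

```python
from collections import Counter
from itertools import cycle

def majority_answer(answers):
    # The "most consistent" answer is simply the most frequent final answer.
    return Counter(answers).most_common(1)[0][0]

def self_consistency(sample_fn, question, n_samples=10):
    # Sample n diverse reasoning paths; marginalize over the paths by
    # majority vote on their final answers.
    answers = [sample_fn(question) for _ in range(n_samples)]
    return majority_answer(answers)

# Usage with a deterministic stub sampler: two paths conclude "18",
# one concludes "26", repeating.
stub_answers = cycle(["18", "18", "26"])
result = self_consistency(lambda q: next(stub_answers), "example question", n_samples=9)
print(result)  # "18" wins the majority vote
```

In practice the gain comes from diversity: greedy decoding commits to one reasoning path, while voting across sampled paths recovers the answer that many independent chains agree on.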
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
This paper introduces a unified text-to-text framework for transfer learning in NLP, achieving state-of-the-art results across diverse language tasks by systematically exploring pre-training and fine-tuning strategies.
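The unifying trick is that every task, from translation to classification, is cast as mapping an input string to an output string, with a short task prefix telling the model what to do. A minimal sketch of that input formatting, using prefixes in the style of the T5 paper (the exact prefix strings here are illustrative, not an exhaustive or authoritative list):

```python
def to_text2text(prefix, source):
    # T5-style framing: prepend a task prefix so one model handles all tasks
    # as plain text-in, text-out.
    return f"{prefix}: {source}"

# Different tasks, one uniform interface; the target is always a string too
# (e.g. the German translation, or the label word "positive").
translation_input = to_text2text("translate English to German", "That is good.")
sentiment_input = to_text2text("sst2 sentence", "the movie was wonderful")

print(translation_input)
print(sentiment_input)
```

Because inputs and targets are always text, the same model, loss, and decoding procedure apply unchanged across tasks, which is what makes the systematic comparison of pre-training and fine-tuning strategies in the paper possible.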