ArXiv TLDR

Luke Zettlemoyer

7 papers · Latest:

Natural Language Processing

Fast Byte Latent Transformer

The Fast Byte Latent Transformer (BLT) introduces training and generation techniques that significantly speed up byte-level language models.

2605.08044
Computer Vision

Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation

Tuna-2 is a unified multimodal model that uses pixel embeddings for both understanding and generation, outperforming vision-encoder-based approaches while simplifying the architecture.

2604.24763
Natural Language Processing

Micro Language Models Enable Instant Responses

Micro LMs (8M–30M parameters) enable instant, contextually grounded responses on edge devices by starting a reply locally while a larger cloud model completes it.

2604.19642
Natural Language Processing

LIMA: Less Is More for Alignment

LIMA shows that fine-tuning a large language model on just 1,000 curated examples can achieve performance comparable to state-of-the-art models, highlighting the dominant role of pretraining over extensive instruction tuning.

2305.11206
Natural Language Processing

Toolformer: Language Models Can Teach Themselves to Use Tools

Toolformer enables language models to teach themselves to use external tools via simple APIs in a self-supervised way, significantly improving performance on diverse tasks.

2302.04761
Natural Language Processing

OPT: Open Pre-trained Transformer Language Models

OPT is a suite of openly released transformer language models (125M–175B parameters) comparable to GPT-3 while requiring roughly 1/7th the carbon footprint to develop.

2205.01068
Natural Language Processing

RoBERTa: A Robustly Optimized BERT Pretraining Approach

RoBERTa revisits BERT pretraining with carefully tuned hyperparameters and more training data, achieving state-of-the-art NLP results and showing that the original BERT was significantly undertrained.

1907.11692

📬 Weekly AI Paper Digest

Get the top 10 AI/ML arXiv papers from the week: summarized, scored, and delivered to your inbox every Monday.