Mostafa Dehghani
3 papers · Latest:
Natural Language Processing
Gemini: A Family of Highly Capable Multimodal Models
Gemini is a new family of multimodal AI models that excel at image, audio, video, and text understanding, achieving state-of-the-art results across numerous benchmarks, including human-expert-level performance on MMLU.
2312.11805
Natural Language Processing
Transcending Scaling Laws with 0.1% Extra Compute
Continued training with UL2R significantly improves large language model performance and scaling efficiency with only 0.1% extra compute, yielding substantial computational savings and emergent abilities.
2210.11399
Computer Vision
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
This paper demonstrates that a pure Transformer model applied directly to image patches can achieve state-of-the-art image classification performance without relying on convolutional networks.
2010.11929