Sten Sootla
3 papers · Latest:
Software Engineering
ProgramBench: Can Language Models Rebuild Programs From Scratch?
ProgramBench evaluates language models' ability to holistically rebuild software from scratch, revealing current LMs struggle with architectural decisions.
2605.03546
Artificial Intelligence
The Llama 3 Herd of Models
Llama 3 is a new family of large multilingual foundation models excelling in language, coding, reasoning, and multimodal tasks, rivaling GPT-4 in quality and offering extensive public releases.
2407.21783
Natural Language Processing
Code Llama: Open Foundation Models for Code
Code Llama is a new family of open-source large language models specialized for coding tasks, achieving state-of-the-art results on multiple benchmarks with support for long contexts and code infilling.
2308.12950