Daniel Y. Fu
3 papers ยท Latest:
Machine Learning
Search Your Block Floating Point Scales!
ScaleSearch optimizes Block Floating Point quantization scales by searching for minimal error, significantly improving generative model performance.
2605.12464
Machine LearningParcae: Scaling Laws For Stable Looped Language Models
Parcae is a stable looped language model architecture that resolves instability, improves quality, and provides scaling laws for efficiency.
2604.12946
Machine LearningFlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
FlashAttention is an IO-aware exact attention algorithm that significantly speeds up Transformer training and enables longer context lengths by optimizing GPU memory access patterns.
2205.14135
๐ฌ Weekly AI Paper Digest
Get the top 10 AI/ML arXiv papers from the week โ summarized, scored, and delivered to your inbox every Monday.