Ali Farhadi
5 papers · Latest:
MolmoAct2: Action Reasoning Models for Real-world Deployment
MolmoAct2 is an open-source action reasoning model for robots, featuring a specialized VLM, new datasets, and an efficient architecture for real-world deployment.
VideoNet: A Large-Scale Dataset for Domain-Specific Action Recognition
VideoNet is a new large-scale dataset and benchmark for domain-specific action recognition; it shows that VLMs struggle on this task and proposes a novel training approach.
Posterior Augmented Flow Matching
PAFM enhances flow matching by using posterior-augmented supervision to reduce training variance, preventing flow collapse and improving generative models.
Seeing Fast and Slow: Learning the Flow of Time in Videos
This paper introduces self-supervised models to detect and manipulate video playback speed, enabling temporal super-resolution and speed-conditioned video generation.
MolmoWeb: Open Visual Web Agent and Open Data for the Open Web
MolmoWeb introduces open visual web agents and a large, diverse dataset (MolmoWebMix) that achieve state-of-the-art performance on web-use benchmarks.