Chen Liang

2 papers · Latest: April 28, 2026

Beyond Screenshots: Evaluating VLMs' Understanding of UI Animations

This paper evaluates VLMs' understanding of UI animations using a new dataset, finding they detect motion but struggle with high-level interpretation.

2604.26148 · Apr 28, 2026

Natural Language Processing

Gemini: A Family of Highly Capable Multimodal Models

Gemini is a new family of multimodal AI models that excels in image, audio, video, and text understanding, achieving state-of-the-art results across numerous benchmarks, including human-expert-level performance on MMLU.

2312.11805 · Dec 19, 2023
