Yueting Zhuang
6 papers · Latest:
SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments
SpatialEvo uses deterministic geometric environments to enable self-evolving 3D spatial reasoning, outperforming existing methods by generating precise, physically valid training data.
UI-Zoomer: Uncertainty-Driven Adaptive Zoom-In for GUI Grounding
UI-Zoomer adaptively zooms into GUI elements based on prediction uncertainty, improving localization for small icons and dense layouts without retraining.
LMMs Meet Object-Centric Vision: Understanding, Segmentation, Editing and Generation
LMMs struggle with object-level tasks; this paper reviews how object-centric vision enhances LMMs for precise understanding, segmentation, editing, and generation.
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents
ClawGUI is an open-source framework that unifies training, evaluation, and deployment for GUI agents, addressing key infrastructure bottlenecks.
Seeing but Not Thinking: Routing Distraction in Multimodal Mixture-of-Experts
This paper identifies "Seeing but Not Thinking" in multimodal MoE models, where visual inputs cause routing distraction, and proposes an intervention.
SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization
SKILL0 is an in-context RL framework that internalizes agent skills into LLM parameters, enabling zero-shot autonomous behavior.