Yongliang Shen

6 papers · Latest: April 21, 2026

Pause or Fabricate? Training Language Models for Grounded Reasoning

GRIL is a new RL framework that trains LLMs to detect incomplete information, pause, and clarify, reducing fabrication and improving grounded reasoning.

2604.19656 · Apr 21, 2026

Computer Vision

SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments

SpatialEvo uses deterministic geometric environments to enable self-evolving 3D spatial reasoning, outperforming existing methods by generating precise, physically valid training data.

2604.14144 · Apr 15, 2026

Computer Vision

UI-Zoomer: Uncertainty-Driven Adaptive Zoom-In for GUI Grounding

UI-Zoomer adaptively zooms into GUI elements based on prediction uncertainty, improving localization for small icons and dense layouts without retraining.

2604.14113 · Apr 15, 2026

Machine Learning

ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents

ClawGUI is an open-source framework that unifies training, evaluation, and deployment for GUI agents, addressing key infrastructure bottlenecks.

2604.11784 · Apr 13, 2026

Computer Vision

Seeing but Not Thinking: Routing Distraction in Multimodal Mixture-of-Experts

This paper identifies "Seeing but Not Thinking" in multimodal MoE models, where visual inputs cause routing distraction, and proposes an intervention.

2604.08541 · Apr 9, 2026

Machine Learning

SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization

SKILL0 is an in-context RL framework that internalizes agent skills into LLM parameters, enabling zero-shot autonomous behavior.

2604.02268 · Apr 2, 2026
