Yongliang Shen
6 papers · Latest:
Pause or Fabricate? Training Language Models for Grounded Reasoning
GRIL is a new RL framework that trains LLMs to detect incomplete information, pause, and clarify, reducing fabrication and improving grounded reasoning.
SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments
SpatialEvo uses deterministic geometric environments to enable self-evolving 3D spatial reasoning, outperforming existing methods by generating precise, physically valid training data.
UI-Zoomer: Uncertainty-Driven Adaptive Zoom-In for GUI Grounding
UI-Zoomer adaptively zooms into GUI elements based on prediction uncertainty, improving localization for small icons and dense layouts without retraining.
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents
ClawGUI is an open-source framework that unifies training, evaluation, and deployment for GUI agents, addressing key infrastructure bottlenecks.
Seeing but Not Thinking: Routing Distraction in Multimodal Mixture-of-Experts
This paper identifies the "Seeing but Not Thinking" phenomenon in multimodal MoE models, in which visual inputs cause routing distraction, and proposes an intervention.
SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization
SKILL0 is an in-context RL framework that internalizes agent skills into LLM parameters, enabling zero-shot autonomous behavior.