ArXiv TLDR

AgentLens: Adaptive Visual Modalities for Human-Agent Interaction in Mobile GUI Agents

2604.20279

Jeonghyeon Kim, Byeongjun Joung, Junwon Lee, Joohyung Lee, Taehoon Min + 1 more

cs.HC, cs.AI, cs.MA

TLDR

AgentLens introduces adaptive visual modalities for mobile GUI agents, balancing transparency and multitasking during human-agent interaction.

Key contributions

  • Introduces AgentLens, a mobile GUI agent with adaptive visual modalities.
  • Adaptively uses three visual modalities (Full UI, Partial UI, and GenUI) for just-in-time visual interaction.
  • Enables background execution with selective visual overlays via Virtual Display.
  • Reports 85.7% user preference, the highest usability (1.94 Overall PSSUQ, where lower is better), and 6.43/7 adoption intent in a controlled study with 21 participants.

Why it matters

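Existing mobile GUI agents struggle to balance transparency and multitasking: foreground execution is transparent but blocks multitasking, while background execution allows multitasking but offers little visual awareness. AgentLens addresses this with adaptive visual communication, so users can stay aware of the agent's actions without giving up multitasking. This could improve the user experience and adoption potential of mobile automation.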

Original Abstract

Mobile GUI agents can automate smartphone tasks by interacting directly with app interfaces, but how they should communicate with users during execution remains underexplored. Existing systems rely on two extremes: foreground execution, which maximizes transparency but prevents multitasking, and background execution, which supports multitasking but provides little visual awareness. Through iterative formative studies, we found that users prefer a hybrid model with just-in-time visual interaction, but the most effective visualization modality depends on the task. Motivated by this, we present AgentLens, a mobile GUI agent that adaptively uses three visual modalities during human-agent interaction: Full UI, Partial UI, and GenUI. AgentLens extends a standard mobile agent with adaptive communication actions and uses Virtual Display to enable background execution with selective visual overlays. In a controlled study with 21 participants, AgentLens was preferred by 85.7% of participants and achieved the highest usability (1.94 Overall PSSUQ) and adoption-intent (6.43/7).
