ArXiv TLDR

GazeMind: A Gaze-Guided LLM Agent for Personalized Cognitive Load Assessment

arXiv: 2605.05790

Bin Wang, Yue Liu, Benjamin Newman, Ajoy S. Fernandes, Zhiyuan Wang + 7 more

cs.HC

TLDR

GazeMind is a gaze-guided LLM agent for personalized, interpretable cognitive load assessment on smart glasses, outperforming baselines by over 20%.

Key contributions

  • GazeMind uses a gaze-guided LLM agent for personalized cognitive load assessment on smart glasses.
  • Encodes eye-tracking data into structured representations for interpretable LLM-based reasoning.
  • Generalizes across tasks without fine-tuning and adapts to users via historical data.
  • Introduces CogLoad-Bench, a large dataset (152 participants, 40+ hours) for gaze-based cognitive load.

Why it matters

Current smart glasses lack cognitive load awareness, preventing proactive user assistance. Existing methods are impractical or fail to generalize. GazeMind offers an interpretable, generalizable, and personalized solution, enabling AI to anticipate user needs.

Original Abstract

Smart glasses with AI assistants are increasingly used in daily life. However, current systems lack awareness of the user's internal cognitive state, leaving them unable to proactively anticipate users' needs without access to cognitive load. Existing methods for assessing cognitive load either rely on impractical sensors for lightweight eyewear or utilize eye gaze-based models that suffer from poor interpretability, and require task-specific fine-tuning, often failing to generalize across individuals. We propose GazeMind, a gaze-guided LLM agent framework for cognitive load assessment on smart glasses. It encodes eye-tracking data into structured representations for LLM-based reasoning and provides interpretable cognitive load predictions. Importantly, GazeMind generalizes across scenarios without LLM fine-tuning through a novel task-guidance reasoning approach and achieves personalized adaptation by incorporating user-specific characteristics and historical references. To support evaluation, we introduce CogLoad-Bench, the largest gaze-based cognitive load dataset with 152 participants, 40+ hours of multimodal data, and 10K+ real-time annotations across controlled and real-world tasks. Experiments show that GazeMind achieves state-of-the-art performance, outperforming baselines by over 20% across all metrics.
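The abstract describes encoding eye-tracking data into structured representations that an LLM can reason over. A minimal sketch of that general idea is below: raw gaze samples are summarized into a compact, human-readable feature string that could be placed in an LLM prompt. All class names, feature names, and thresholds here are illustrative assumptions, not details from the GazeMind paper.

```python
# Hypothetical sketch: turning a window of raw eye-tracking samples into a
# structured text summary suitable for an LLM prompt. The fixation heuristic
# and feature set are illustrative assumptions, not the paper's method.
from dataclasses import dataclass
from statistics import mean


@dataclass
class GazeSample:
    t: float          # timestamp in seconds
    x: float          # normalized horizontal gaze position (0..1)
    y: float          # normalized vertical gaze position (0..1)
    pupil_mm: float   # pupil diameter in millimeters


def encode_gaze_window(samples: list[GazeSample],
                       fixation_radius: float = 0.02) -> str:
    """Summarize a gaze window as a structured key=value string."""
    # Naive fixation count: a new "fixation" starts whenever the gaze
    # moves farther than fixation_radius between consecutive samples.
    fixations = 1
    for prev, cur in zip(samples, samples[1:]):
        if (abs(cur.x - prev.x) > fixation_radius
                or abs(cur.y - prev.y) > fixation_radius):
            fixations += 1
    duration = samples[-1].t - samples[0].t
    return (f"window_s={duration:.1f}; "
            f"fixation_count={fixations}; "
            f"mean_pupil_mm={mean(s.pupil_mm for s in samples):.2f}")


samples = [GazeSample(0.0, 0.50, 0.50, 3.1),
           GazeSample(0.5, 0.51, 0.50, 3.3),
           GazeSample(1.0, 0.80, 0.20, 3.6)]
print(encode_gaze_window(samples))
# → window_s=1.0; fixation_count=2; mean_pupil_mm=3.33
```

A summary string like this could then be embedded in a prompt alongside task context and user history, which is the kind of interpretable, text-based input the abstract attributes to the framework.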
