ArXiv TLDR

SIMON: Saliency-aware Integrative Multi-view Object-centric Neural Decoding

arXiv: 2605.00401

YuSheng Lin, Ji-Hwa Tsai, Chun-Shu Wei

cs.CV · q-bio.NC

TLDR

SIMON introduces a saliency-aware multi-view framework for zero-shot EEG-to-image retrieval, achieving state-of-the-art performance by focusing on salient object regions.

Key contributions

  • Proposes SIMON, a saliency-aware multi-view framework for zero-shot EEG-to-image retrieval.
  • Uses Saliency-Aware Sampling (SAS) to select fixation centers based on foreground and saliency.
  • Generates foveated views that emphasize informative object regions, reducing background clutter.
  • Achieves state-of-the-art Top-1 accuracy on THINGS-EEG (69.7% intra-subject, 19.6% inter-subject).
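The sampling-then-cropping pipeline above can be sketched in a few lines. This is a hypothetical illustration, not the authors' code: it assumes SAS amounts to drawing fixation centers with probability proportional to saliency restricted to the foreground mask, and approximates foveated rendering with simple square crops around each center.

```python
import numpy as np

def saliency_aware_sampling(saliency, fg_mask, n_views=4, rng=None):
    """Sample fixation centers with probability proportional to saliency
    restricted to the foreground mask (hypothetical sketch of SAS)."""
    rng = np.random.default_rng(rng)
    weights = (saliency * fg_mask).astype(np.float64).ravel()
    if weights.sum() == 0:               # no salient foreground: fall back to uniform
        weights = np.ones_like(weights)
    probs = weights / weights.sum()
    idx = rng.choice(weights.size, size=n_views, replace=False, p=probs)
    h, w = saliency.shape
    # (n_views, 2) array of (row, col) fixation centers
    return np.stack(np.unravel_index(idx, (h, w)), axis=1)

def foveated_views(image, centers, crop=96):
    """Crop a square window around each fixation center, clamped to the image.
    A stand-in for true multi-resolution foveation."""
    h, w = image.shape[:2]
    half = crop // 2
    views = []
    for r, c in centers:
        r0 = int(np.clip(r - half, 0, h - crop))
        c0 = int(np.clip(c - half, 0, w - crop))
        views.append(image[r0:r0 + crop, c0:c0 + crop])
    return views
```

Each cropped view would then be fed to the pretrained vision encoder, so the EEG representation is aligned against object-centric rather than center-biased features.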

Why it matters

This paper addresses a key limitation in EEG-to-image retrieval: the mismatch between fixed, center-focused views and content-driven human attention. By sampling fixation centers from saliency and foreground cues, SIMON substantially improves retrieval accuracy, a step toward more robust brain-computer interfaces and more natural, context-aware neural decoding.

Original Abstract

Recent EEG-to-image retrieval methods leverage pretrained vision encoders and foveation-inspired priors, but typically assume a fixed, center-focused view. This center bias conflicts with content-driven human attention, creating a geometric-semantic dissociation between visual features and EEG responses. We propose SIMON, a saliency-aware multi-view framework for zero-shot EEG-to-image retrieval. SIMON combines foreground segmentation and saliency prediction to select fixation centers via Saliency-Aware Sampling (SAS), then generates foveated views that emphasize informative object regions while suppressing background clutter. On THINGS-EEG, SIMON achieves state-of-the-art performance in both intra-subject and inter-subject settings, reaching an average Top-1 accuracy of 69.7% and 19.6%, respectively, consistently outperforming recent competitive baselines. Analyses across sampling granularity, EEG channel topology, and visual/brain encoder backbones further support the robustness of saliency-aware multi-view integration. Our code and models are publicly available at https://github.com/simonlink666/SIMON.
