Same Image, Different Meanings: Toward Retrieval of Context-Dependent Meanings
Ayuto Tsutsumi, Ryosuke Kohita
TLDR
This paper explores retrieval of context-dependent image meanings and finds that the more abstract a meaning is, the more narrative context its retrieval requires.
Key contributions
- Developed an L1-L4 framework to categorize image semantics from context-independent to context-dependent.
- Observed that concrete queries (objects, actions) are retrievable without context, while abstract ones (atmosphere, intent) increasingly depend on narrative context.
- Found that where context is injected matters, with image-side enrichment proving particularly effective.
- Found that the most abstract level (L4) remains challenging even with full context, leaving context-dependent retrieval an open problem.
Why it matters
Current image retrieval systems largely treat an image's meaning as fixed, yet the same image can mean different things in different stories. The L1-L4 framework and the paper's findings lay groundwork for retrieval systems that interpret images within their narrative settings.
Original Abstract
A scene of two people in the rain can convey hope and warmth in a reunion story or sorrow and finality in a farewell story. We investigate this context-dependent nature of image meaning and its implications for retrieval. Our key observation is that context dependency correlates with semantic abstraction: concrete elements (objects, actions) remain stable across contexts, while abstract elements (atmosphere, intent) shift with context. We operationalize this as the L1-L4 framework, organizing image semantics from context-independent (L1) to maximally context-dependent (L4). Using synthetic story contexts and queries for controlled evaluation, we examine how injecting narrative context into embeddings affects retrieval across abstraction levels. Concrete queries are retrievable without context, while abstract levels increasingly depend on narrative grounding. Where context is injected also matters, with image-side enrichment proving particularly effective. The most abstract level, however, remains challenging even with full context, highlighting context-dependent image retrieval as an important open problem. Our framework and findings lay groundwork toward retrieval systems that handle the context-dependent meanings images acquire in narrative settings.
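To make the idea of image-side context injection concrete, here is a minimal toy sketch. It is not the authors' method: the abstract does not say how context is injected, so the blending rule, the `alpha` weight, the function names (`inject_context`, `rank`), and the 3-dimensional vectors are all assumptions made for illustration. It shows how the same image embedding, once blended with two different story embeddings, can be ranked differently for an abstract query.

```python
import math

def normalize(v):
    """Scale a vector to unit length (returns a copy if the norm is zero)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else list(v)

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def inject_context(image_vec, context_vec, alpha=0.5):
    """Hypothetical image-side enrichment: a weighted blend of the
    unit-normalized image and context embeddings (an assumption,
    not the paper's stated procedure)."""
    img, ctx = normalize(image_vec), normalize(context_vec)
    return normalize([(1 - alpha) * i + alpha * c for i, c in zip(img, ctx)])

def rank(query_vec, candidates):
    """Return candidate names sorted by descending cosine similarity."""
    return sorted(candidates,
                  key=lambda name: cosine(query_vec, candidates[name]),
                  reverse=True)

# Made-up toy embeddings: one image, two story contexts, one abstract query.
rain_scene = [1.0, 0.0, 0.0]    # the same image appears in both stories
reunion_ctx = [0.0, 1.0, 0.0]   # stand-in for a reunion story embedding
farewell_ctx = [0.0, 0.0, 1.0]  # stand-in for a farewell story embedding
query_hope = [0.6, 0.8, 0.0]    # stand-in for "a hopeful moment in the rain"

# Without context, the two entries are the same vector and cannot be separated.
plain = {"reunion": rain_scene, "farewell": rain_scene}

# With image-side context, each entry carries its story's meaning.
enriched = {
    "reunion": inject_context(rain_scene, reunion_ctx),
    "farewell": inject_context(rain_scene, farewell_ctx),
}

print(rank(query_hope, enriched))
```

In this toy setup the plain entries tie on similarity, while the enriched entries separate, which mirrors the abstract's point that abstract queries depend on narrative grounding. A real system would use learned image and text encoders in place of these hand-written vectors.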