Appear2Meaning: A Cross-Cultural Benchmark for Structured Cultural Metadata Inference from Images
Yuechen Jiang, Enze Zhang, Md Mohsinul Kabir, Qianqian Xie, Stavroula Golfomitsou, et al.
TLDR
Introduces Appear2Meaning, a cross-cultural benchmark for evaluating whether VLMs can infer structured cultural metadata from images; results show that current models make inconsistent, weakly grounded predictions.
Key contributions
- Introduces Appear2Meaning, a new cross-cultural benchmark for structured cultural metadata inference.
- Evaluates VLMs with an LLM-as-Judge framework that scores the semantic alignment between model predictions and reference cultural annotations (see the sketch after this list).
- Reveals that VLMs struggle to produce consistent, well-grounded predictions across cultures and metadata types.
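The paper does not publish its judging prompt or pipeline, but a minimal LLM-as-Judge setup for this kind of semantic-alignment scoring could look like the sketch below. The choice of judge model, the prompt wording, the 0-2 scale, and the `judge_alignment` helper are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of an LLM-as-Judge check for semantic alignment between a
# model's predicted metadata value and the reference annotation.
# Assumptions (not from the paper): the judge model, prompt wording, and the
# 0-2 scoring scale are illustrative choices.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

JUDGE_PROMPT = """You are grading cultural-metadata predictions.
Attribute: {attribute}
Reference annotation: {reference}
Model prediction: {prediction}

Score the semantic alignment:
2 = equivalent meaning (exact match),
1 = partially overlapping meaning (partial match),
0 = unrelated or wrong.
Answer with a single digit."""

def judge_alignment(attribute: str, reference: str, prediction: str) -> int:
    """Ask a judge LLM to score one attribute prediction on a 0-2 scale."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # the judge model here is an assumption
        messages=[{
            "role": "user",
            "content": JUDGE_PROMPT.format(
                attribute=attribute, reference=reference, prediction=prediction
            ),
        }],
        temperature=0,
    )
    text = (response.choices[0].message.content or "").strip()
    return int(text[0]) if text and text[0] in "012" else 0

# Example: judging a hypothetical "period" prediction against its annotation.
print(judge_alignment("period", "Edo period (1603-1868)", "19th-century Japan"))
```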
Why it matters
This paper addresses a critical gap in VLM capabilities beyond basic image captioning. By quantifying current models' limitations in cultural reasoning, it paves the way for future research on more culturally aware AI systems, which are crucial for cultural heritage applications.
Original Abstract
Recent advances in vision-language models (VLMs) have improved image captioning for cultural heritage. However, inferring structured cultural metadata (e.g., creator, origin, period) from visual input remains underexplored. We introduce a multi-category, cross-cultural benchmark for this task and evaluate VLMs using an LLM-as-Judge framework that measures semantic alignment with reference annotations. To assess cultural reasoning, we report exact-match, partial-match, and attribute-level accuracy across cultural regions. Results show that models capture fragmented signals and exhibit substantial performance variation across cultures and metadata types, leading to inconsistent and weakly grounded predictions. These findings highlight the limitations of current VLMs in structured cultural metadata inference beyond visual perception.
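For intuition, here is one way the reported exact-match, partial-match, and attribute-level accuracies could be aggregated from per-prediction judge scores. The record layout and the 0/1/2 score convention are assumptions carried over from the sketch above, not the paper's actual pipeline.

```python
# Sketch: aggregating exact-match, partial-match, and attribute-level accuracy
# from per-prediction judge scores. The record layout and the 0/1/2 score
# convention are assumptions for illustration, not the paper's exact metrics.
from collections import defaultdict

# Each record: (cultural_region, attribute, judge_score), where
# 2 = exact match, 1 = partial match, 0 = no match.
records = [
    ("East Asia", "creator", 2),
    ("East Asia", "period", 1),
    ("Europe", "origin", 0),
    ("Europe", "period", 2),
]

def accuracy_by(records, key_index):
    """Exact- and partial-match rates grouped by region (0) or attribute (1)."""
    totals = defaultdict(lambda: {"n": 0, "exact": 0, "partial": 0})
    for record in records:
        bucket = totals[record[key_index]]
        bucket["n"] += 1
        bucket["exact"] += record[2] == 2
        bucket["partial"] += record[2] >= 1  # partial match includes exact
    return {
        key: {
            "exact_match": bucket["exact"] / bucket["n"],
            "partial_match": bucket["partial"] / bucket["n"],
        }
        for key, bucket in totals.items()
    }

print(accuracy_by(records, 0))  # per cultural region
print(accuracy_by(records, 1))  # per metadata attribute (attribute-level)
```

Grouping the same scores along both axes is what exposes the cross-cultural and cross-attribute performance variation the abstract describes.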