UsefulBench: Towards Decision-Useful Information as a Target for Information Retrieval

2604.15827

Tobias Schimanski, Stefanie Lewandowski, Christian Woerle, Nicola Reichenau, Yauheni Huryn + 1 more

cs.IR, cs.CL

TLDR

UsefulBench is an expert-labeled, domain-specific dataset that targets practical usefulness, not just relevance, in information retrieval.

Key contributions

  • Defines and distinguishes relevance from practical usefulness in IR.
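  • Presents UsefulBench, a domain-specific dataset in which three professional analysts label each text for relevance and for usefulness.
  • Shows that classic similarity-based IR aligns more strongly with relevance; LLM-based systems can counteract this bias but do not fully incorporate the required domain expertise.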
  • Highlights challenges for IR systems to target decision-useful information effectively.

Why it matters

The paper argues that IR should target practical usefulness rather than relevance alone, which matters for real-world decision-making. UsefulBench gives developers a benchmark for building IR systems that are better aligned with what users actually need.

Original Abstract

Conventional information retrieval is concerned with identifying the relevance of texts for a given query. Yet, the conventional definition of relevance is dominated by aspects of similarity in texts, leaving unobserved whether the text is truly useful for addressing the query. For instance, when answering whether Paris is larger than Berlin, texts about Paris being in France are relevant (lexical/semantic similarity), but not useful. In this paper, we introduce UsefulBench, a domain-specific dataset curated by three professional analysts labeling whether a text is connected to a query (relevance) or holds practical value in responding to it (usefulness). We show that classic similarity-based information retrieval aligns more strongly with relevance. While LLM-based systems can counteract this bias, we find that domain-specific problems require a high degree of expertise, which current LLMs do not fully incorporate. We explore approaches to (partially) overcome this challenge. However, UsefulBench presents a dataset challenge for targeted information retrieval systems.
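
The gap between relevance and usefulness described in the abstract can be illustrated with a toy ranking. The sketch below is not the paper's method or data: the documents, the labels, and the names `tokenize`, `cosine`, and `rank` are all hypothetical, and plain bag-of-words cosine similarity stands in for "classic similarity-based retrieval". It only shows how a text that shares words with the query can outrank a text that actually helps answer it.

```python
import math
import re
from collections import Counter

# Toy query and documents, invented for illustration (not from UsefulBench).
QUERY = "Is Paris larger than Berlin?"
DOCS = {
    "paris_in_france": "Paris is in France.",
    "size_comparison": ("Berlin has more inhabitants and a greater area "
                        "compared with the French capital."),
    "off_topic": "Bananas are rich in potassium.",
}

# Hypothetical labels in the spirit of the abstract: (relevant, useful).
LABELS = {
    "paris_in_france": (True, False),   # connected to the query, no help answering it
    "size_comparison": (True, True),    # helps answer the query
    "off_topic": (False, False),
}


def tokenize(text):
    """Lowercase the text and return a bag of alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return (doc_id, score) pairs sorted by descending lexical similarity."""
    query_vec = tokenize(query)
    scores = {doc_id: cosine(query_vec, tokenize(text)) for doc_id, text in docs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for doc_id, score in rank(QUERY, DOCS):
        relevant, useful = LABELS[doc_id]
        print(f"{doc_id:16s} score={score:.3f} relevant={relevant} useful={useful}")
```

In this toy setup the top-ranked text is the one labeled relevant but not useful, because it shares the surface words "Paris" and "is" with the query, while the useful text shares only "Berlin". Scoring the same ranking once against the relevance labels and once against the usefulness labels is the kind of comparison a dataset with both label sets makes possible.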