ArXiv TLDR

Enhancing Unsupervised Keyword Extraction in Academic Papers through Integrating Highlights with Abstract

2604.19505

Yi Xiang, Chengzhi Zhang

cs.IR, cs.CL, cs.DL

TLDR

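This paper shows that feeding a paper's highlights together with its abstract into unsupervised keyword extraction models significantly improves performance over using the abstract alone, across four models on Computer Science and Library and Information Science datasets.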

Key contributions

  • Proposes using the 'highlights' section of academic papers as an input for unsupervised keyword extraction.
  • Evaluates three input scenarios: abstract only, highlights only, and abstract plus highlights.
  • Shows that combining highlights with the abstract significantly improves extraction performance for four unsupervised models on CS and LIS datasets.
  • Analyzes how keyword coverage and content differ between abstracts and highlights, and how those differences affect extraction results.
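
The keyword coverage comparison in the last point can be illustrated with a minimal sketch. It assumes coverage means the fraction of author-assigned keywords that appear verbatim in a text after simple normalization; the paper may define it differently. The function names and the sample texts are hypothetical, invented only for illustration.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and keep only alphanumeric tokens, joined by single spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_coverage(keywords, text):
    """Fraction of gold keywords found verbatim in the text.

    Assumption: exact phrase match after normalization, no stemming.
    """
    if not keywords:
        return 0.0
    haystack = f" {normalize(text)} "
    hits = sum(1 for kw in keywords if f" {normalize(kw)} " in haystack)
    return hits / len(keywords)


# Invented toy texts, not taken from the paper's datasets.
abstract = "We study unsupervised keyword extraction from academic papers."
highlights = (
    "Highlights summarize key findings. "
    "Combined input is evaluated with graph-based ranking."
)
keywords = [
    "keyword extraction",
    "highlights",
    "graph-based ranking",
    "academic papers",
]

cov_abstract = keyword_coverage(keywords, abstract)
cov_highlights = keyword_coverage(keywords, highlights)
cov_combined = keyword_coverage(keywords, abstract + " " + highlights)
```

In this toy case each text alone covers half of the keywords and the combined text covers all of them, which is the kind of complementarity the analysis looks for.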

Why it matters

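Automatic keyword extraction is a core task in NLP and information retrieval, and prior work has mainly drawn on abstracts and references. This paper shows that the highlights section, a short summary of a paper's key findings and contributions, carries keyword information that complements the abstract, and that combining the two significantly improves unsupervised extraction. Better extracted keywords could in turn improve paper indexing and discoverability.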

Original Abstract

Automatic keyword extraction from academic papers is a key area of interest in natural language processing and information retrieval. Although previous research has mainly focused on utilizing abstract and references for keyword extraction, this paper focuses on the highlights section - a summary describing the key findings and contributions, offering readers a quick overview of the research. Our observations indicate that highlights contain valuable keyword information that can effectively complement the abstract. To investigate the impact of incorporating highlights into unsupervised keyword extraction, we evaluate three input scenarios: using only the abstract, the highlights, and a combination of both. Experiments conducted with four unsupervised models on Computer Science (CS), Library and Information Science (LIS) datasets reveal that integrating the abstract with highlights significantly improves extraction performance. Furthermore, we examine the differences in keyword coverage and content between abstract and highlights, exploring how these variations influence extraction outcomes. The data and code are available at https://github.com/xiangyi-njust/Highlight-KPE.
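
The three input scenarios described in the abstract can be sketched as follows. This is a rough illustration, not the authors' pipeline: the abstract does not name the four unsupervised models or say how the two texts are combined, so `toy_extract` is a hypothetical frequency-based stand-in, plain concatenation is an assumption, and exact-match F1 over the top-k predictions is an assumed metric.

```python
import re
from collections import Counter

# Small illustrative stopword list (assumption, not from the paper).
STOPWORDS = {
    "a", "an", "and", "the", "of", "to", "in", "for", "with",
    "is", "are", "we", "on", "from", "that", "this",
}


def tokenize(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_extract(text, top_k=5):
    """Hypothetical stand-in extractor: rank stopword-free unigrams and
    bigrams by frequency. Any unsupervised model could be swapped in."""
    tokens = tokenize(text)
    counts = Counter()
    for n in (1, 2):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if not any(t in STOPWORDS for t in gram):
                counts[" ".join(gram)] += 1
    # Sort by frequency, then prefer longer phrases, then alphabetically.
    ranked = sorted(counts, key=lambda p: (-counts[p], -len(p.split()), p))
    return ranked[:top_k]


def f1_at_k(predicted, gold):
    """Exact-match F1 between predicted phrases and gold keywords."""
    gold_set = {" ".join(tokenize(g)) for g in gold}
    pred_set = set(predicted)
    correct = len(pred_set & gold_set)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_set)
    recall = correct / len(gold_set)
    return 2 * precision * recall / (precision + recall)


def build_inputs(abstract, highlights):
    """The three scenarios: abstract only, highlights only, both."""
    return {
        "abstract": abstract,
        "highlights": highlights,
        # Assumption: simple concatenation of the two texts.
        "combined": abstract + " " + highlights,
    }


def evaluate(abstract, highlights, gold, extractor=toy_extract, top_k=5):
    """Score one paper under each input scenario."""
    inputs = build_inputs(abstract, highlights)
    return {
        name: f1_at_k(extractor(text, top_k), gold)
        for name, text in inputs.items()
    }
```

Calling `evaluate(abstract, highlights, gold_keywords)` returns one F1 score per scenario for a single paper; averaging those scores over a dataset gives the kind of per-scenario comparison the abstract reports.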
