HalluCiteChecker: A Lightweight Toolkit for Hallucinated Citation Detection and Verification in the Era of AI Scientists

arXiv:2604.26835

Yusuke Sakai, Hidetaka Kamigaito, Taro Watanabe

cs.CL, cs.AI, cs.DL

TLDR

HalluCiteChecker is a lightweight, CPU-efficient toolkit designed to detect and verify hallucinated citations in scientific papers, reducing reviewer workload.

Key contributions

  • Formalizes hallucinated citation detection as an NLP task.
  • Provides a lightweight toolkit for fast, offline, CPU-only verification of citations.
  • Aims to reduce reviewer workload and support pre-review and publication checks by automating citation validity checks.
  • Open-source (Apache 2.0) and distributed via PyPI for easy integration.

Why it matters

Hallucinated citations produced by AI writing assistants do not correspond to any existing work. They undermine the credibility of scientific papers and burden reviewers and authors, who must otherwise verify each reference by hand. HalluCiteChecker automates that detection and verification, and because it runs offline on a CPU in seconds, it makes systematic pre-review and publication checks practical.

Original Abstract

We introduce HalluCiteChecker, a toolkit for detecting and verifying hallucinated citations in scientific papers. While AI assistant technologies have transformed the academic writing process, including citation recommendation, they have also led to the emergence of hallucinated citations that do not correspond to any existing work. Such citations not only undermine the credibility of scientific papers but also impose an additional burden on reviewers and authors, who must manually verify their validity during the review process. In this study, we formalize hallucinated citation detection as an NLP task and provide a corresponding toolkit as a practical foundation for addressing this problem. Our package is lightweight and can perform verification in seconds on a standard laptop. It can also be executed entirely offline and runs efficiently using only CPUs. We hope that HalluCiteChecker will help reduce reviewer workload and support organizers by enabling systematic pre-review and publication checks. Our code is released under the Apache 2.0 license on GitHub and is distributed as an installable package via PyPI. A demonstration video is available on YouTube.
