ArXiv TLDR

CiteRadar: A Citation Intelligence Platform for Researcher Profiling and Geographic Visualization

arXiv:2604.25057

Chenxu Niu, Yiming Sun

cs.LG, cs.DL, cs.HC, cs.IR

TLDR

CiteRadar is an open-source platform that turns a single Google Scholar ID into detailed citation intelligence, researcher profiles, and an interactive geographic visualization.

Key contributions

  • Robust Scholar meta-string parser resilient to Unicode non-breaking-space errors.
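  • Two-stage author disambiguation system, based on stop-word-filtered institution name similarity, that eliminates h-index attribution errors of up to 9x the correct value.
  • OpenAlex web-URL to API-URL conversion fix that raises the fraction of author records with city-level location data from 0% to ~60%.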
  • Interactive, logarithmically-scaled Folium world map with per-city researcher popups.
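
The first contribution can be illustrated with a minimal Python sketch. It is not the authors' parser: the assumed meta-string shape ("authors - venue, year - publisher"), the function name `parse_meta`, and the set of space characters being normalized are all illustrative assumptions. The point it shows is that a naive split on an ASCII " - " separator fails when the separator is padded with U+00A0, so the string is normalized first.

```python
import re

# Assumption: space-like code points that may pad the " - " separators.
NBSP_CHARS = "\u00a0\u202f\u2007"


def parse_meta(meta: str) -> dict:
    """Split an assumed 'authors - venue, year - publisher' string into fields."""
    # Replace non-breaking space variants with plain spaces before splitting.
    cleaned = meta.translate({ord(c): " " for c in NBSP_CHARS})
    # Require whitespace on both sides so hyphenated names are not split.
    parts = [p.strip() for p in re.split(r"\s+-\s+", cleaned)]
    authors = [a.strip() for a in parts[0].split(",") if a.strip()]
    venue, year = None, None
    if len(parts) > 1:
        # A trailing four-digit year, either alone or after a comma.
        m = re.search(r"(?:^|,\s*)((?:19|20)\d{2})$", parts[1])
        if m:
            year = int(m.group(1))
            venue = parts[1][: m.start()].strip(" ,") or None
        else:
            venue = parts[1] or None
    return {"authors": authors, "venue": venue, "year": year}


raw = "J Smith, A Doe\u00a0- Nature, 2020 - nature.com"
print(raw.split(" - ")[0])  # naive split leaves the venue and year glued to the authors
print(parse_meta(raw))
```

In this example the naive split returns the authors, venue, and year as one field, which is the kind of silent corruption the abstract describes; the normalized parse separates them.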

Why it matters

This paper introduces CiteRadar, an open-source tool that addresses the scarcity of accessible platforms for detailed citation analysis. It helps researchers understand the geographic reach of their citations and discover potential collaborators, both of which matter for career development and grant applications.

Original Abstract

Understanding the geographic reach and community structure of one's scholarly citations is increasingly valuable for career development, grant applications, and collaboration discovery -- yet accessible tools for answering these questions remain scarce. Existing bibliometric platforms either require costly institutional subscriptions or expose only aggregate citation counts without granular per-author metadata. We present CiteRadar, an open-source system that accepts a single Google Scholar user identifier and automatically produces a structured output folder containing: the author's complete publication list, all retrieved citing papers with enriched author metadata, two ranked author tables (by citation frequency and by h-index), a plain-text statistical summary, and a self-contained interactive HTML world map -- all from a single command-line invocation. CiteRadar integrates five heterogeneous data sources -- Google Scholar, OpenAlex, CrossRef, Semantic Scholar, and OpenStreetMap Nominatim -- through a carefully engineered five-stage pipeline. Key technical contributions include: (1) a Scholar meta-string parser resilient to Unicode non-breaking-space separators, a pervasive but undocumented quirk in Scholar's HTML that silently corrupts venue and year fields when unhandled; (2) a two-stage author disambiguation system using stop-word-filtered institution name similarity to guard against the well-known same-name entity-merging failure mode in bibliometric databases, demonstrated to eliminate h-index attribution errors of up to 9x the correct value; (3) an OpenAlex web-URL to API-URL conversion fix that raises the fraction of author records with city-level location data from 0% to ~60%; and (4) a logarithmically-scaled interactive Folium world map with per-city researcher popups, rendered as a fully self-contained HTML file.
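
The disambiguation idea in contribution (2) can be sketched in Python as follows. This is a hedged illustration rather than the paper's implementation: the abstract does not define the two stages, so here stage 1 is assumed to be a normalized name match and stage 2 a Jaccard similarity over institution tokens. The stop-word list, the 0.5 threshold, the candidate record fields, and the function names are all hypothetical.

```python
import re

# Hypothetical stop-word list: generic words shared by unrelated institutions.
STOP_WORDS = {"university", "of", "the", "institute", "and", "for", "at",
              "college", "department", "school"}


def institution_tokens(name: str) -> set:
    """Lowercase word tokens of an institution name, minus stop words."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return {t for t in tokens if t not in STOP_WORDS}


def institution_similarity(a: str, b: str) -> float:
    """Jaccard similarity of the filtered token sets (an assumed metric)."""
    ta, tb = institution_tokens(a), institution_tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def normalize_name(name: str) -> str:
    return " ".join(name.lower().split())


def pick_author(name, affiliation, candidates, threshold=0.5):
    """Return the best-matching candidate record, or None if none is close."""
    # Stage 1 (assumed): keep only candidates whose name matches.
    same_name = [c for c in candidates
                 if normalize_name(c["name"]) == normalize_name(name)]
    # Stage 2 (assumed): choose the candidate with the most similar institution.
    best, best_score = None, 0.0
    for cand in same_name:
        score = institution_similarity(affiliation, cand["institution"])
        if score > best_score:
            best, best_score = cand, score
    return best if best_score >= threshold else None


candidates = [
    {"name": "Wei Zhang", "institution": "University of Toronto", "h_index": 12},
    {"name": "Wei Zhang", "institution": "Tsinghua University", "h_index": 95},
]
match = pick_author("Wei Zhang", "The University of Toronto", candidates)
print(match)
```

Filtering stop words matters because, without it, "University of Toronto" and "Tsinghua University" share the token "university" and look partially similar. Returning `None` when no candidate clears the threshold is a design choice in this sketch: it prefers a missing h-index over one borrowed from a different person with the same name.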
