Aspect-Aware Content-Based Recommendations for Mathematical Research Papers
Ankit Satpute, André Greiner-Petter, Noah Gießing, Olaf Teschke, Moritz Schubotz + 2 more
TLDR
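This paper introduces AchGNN, an aspect-conditioned heterogeneous GNN, together with GoldRiM and SilverRiM, two new datasets for aspect-aware content-based recommendation of mathematical research papers. AchGNN outperforms prior aspect-based methods on both datasets.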
Key contributions
- An expert study reveals that relevance between mathematical papers is inherently aspect-driven and is not captured by explicit textual or citation similarity.
- Introduced GoldRiM and SilverRiM, the first datasets for aspect-aware content-based recommendations in mathematics.
- Proposed AchGNN, an aspect-conditioned GNN, modeling text, citations, and author lineage for math papers.
- AchGNN consistently outperforms prior aspect-based methods on GoldRiM and SilverRiM and transfers to machine learning publications (Papers with Code).
Why it matters
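This work addresses a gap in content-based recommendation for mathematics, where related papers often share little textual or citation overlap and existing methods are therefore ineffective. By introducing aspect-aware datasets and an aspect-conditioned heterogeneous GNN, it offers a more effective way to connect conceptually related mathematical papers. This improves discovery for mathematicians, and the evaluation on Papers with Code indicates that the approach transfers to machine learning publications.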
Original Abstract
Content-based research paper recommendation (CbRPR) has seen advances in computer science and biomedicine, but remains unexplored for mathematics, where paper relatedness is more conceptual than explicit textual or citation-based similarity. Mathematics papers may be connected through shared proof techniques, logical implications, or natural generalizations, yet exhibit minimal textual or citation overlap, rendering existing CbRPR ineffective. To address this gap, we first conduct an expert-driven study characterizing mathematical recommendations, revealing that relevance is inherently aspect-driven. Grounded in this insight, we introduce GoldRiM (small, expert-annotated) and SilverRiM (large, automatically derived), the first datasets for aspect-aware CbRPR in mathematics. Recognizing that LLM embeddings of mathematical content alone yield suboptimal representation, we propose AchGNN, an aspect-conditioned heterogeneous GNN that jointly models textual semantics, citation structure, and author lineage. Across GoldRiM and SilverRiM, AchGNN consistently outperforms prior aspect-based CbRPR methods, achieving substantial gains across all evaluated aspects. We conduct ablation studies to analyze the contributions of individual aspect supervision, authorship lineage, and graph-structural signals to AchGNN's performance. To assess domain generality, we further evaluate AchGNN on the Papers with Code dataset of machine learning publications, demonstrating that our aspect-aware approach effectively transfers beyond mathematics. We deploy our system on the MaRDI platform to help mathematicians with recommendations and release datasets and code publicly for reproducibility.
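To make the idea of aspect-driven relevance concrete, here is a minimal sketch of aspect-conditioned ranking. It is not AchGNN and uses no graph network: it only illustrates how the same candidate papers can rank differently depending on the aspect a reader asks about. The embeddings, the aspect names (`proof_technique`, `generalization`), and the gating scheme are all assumptions made up for illustration and do not come from the paper.

```python
import math

# Toy 4-dimensional paper embeddings (made-up values, not from the paper).
PAPERS = {
    "query":   [1.0, 0.0, 1.0, 0.0],
    "paper_a": [1.0, 0.0, 0.0, 1.0],
    "paper_b": [0.0, 1.0, 1.0, 0.0],
}

# Hypothetical aspects. Each aspect is a gate that keeps some embedding
# dimensions and zeroes out the others (an assumption for illustration).
ASPECT_GATES = {
    "proof_technique": [1.0, 1.0, 0.0, 0.0],
    "generalization":  [0.0, 0.0, 1.0, 1.0],
}


def gate(vec, aspect):
    """Condition an embedding on an aspect by element-wise gating."""
    return [v * g for v, g in zip(vec, ASPECT_GATES[aspect])]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(query_id, aspect):
    """Rank all other papers by aspect-conditioned similarity to the query."""
    q = gate(PAPERS[query_id], aspect)
    scored = [
        (pid, cosine(q, gate(vec, aspect)))
        for pid, vec in PAPERS.items()
        if pid != query_id
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for aspect in ASPECT_GATES:
        print(aspect, recommend("query", aspect))
```

In this toy setup, `paper_a` ranks first under the hypothetical `proof_technique` aspect and `paper_b` ranks first under `generalization`, even though the query and candidates never change. A learned model would replace the fixed gates with trained, aspect-specific parameters.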