gipplab/MathAspectRecSys


"Aspect-Aware Content-Based Recommendations for Mathematical Research Papers"

Ankit Satpute*, Andre Greiner-Petter, Noah Gießing, Olaf Teschke, Moritz Schubotz, Akiko Aizawa, and Bela Gipp.

*Corresponding Author

🚀 Accepted to the Full Papers Track of SIGIR 2026

Abstract

This repository contains the code and datasets for an aspect-aware, content-based research paper recommendation system designed specifically for mathematics, where relatedness is often conceptual rather than a matter of textual similarity or citation overlap. Unlike existing approaches, which work well in domains such as computer science or biomedicine, this project addresses the unique challenges of mathematical literature by modeling connections through shared proof techniques, logical implications, and theoretical generalizations. The project introduces GoldRiM and SilverRiM, the first datasets for aspect-aware mathematical paper recommendation, and presents AchGNN, an aspect-conditioned heterogeneous graph neural network that integrates textual semantics, citation networks, and author relationships. Experimental results show that AchGNN consistently outperforms prior recommendation methods across multiple aspects and also generalizes effectively to machine learning literature. The system has been deployed on the MaRDI platform (see "An example document with recommendations") to support mathematical research discovery.

This repository includes the complete source code (src), datasets (data), and supplementary materials (material).

Data

This repository includes scripts to obtain and prepare the two datasets:

  • SilverRiM: Automatically generated aspect-aware recommendations
  • GoldRiM: Gold-standard, manually curated recommendations

See data/README.md for detailed setup instructions and dataset descriptions.

Supplementary Materials

Supplementary materials are available in material. They include a summary table of existing content-based research paper recommendation (CbRPR) datasets, alternative visualizations of the results, definitions of the aspect labels, and additional plots.

Source Code

All source code is available in src. Running any experiment or script, or otherwise reproducing the data, requires the Python virtual environment set up below.

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
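
Before running any script, it can help to confirm that the virtual environment is actually active. The following is a minimal sketch using only the Python standard library; the function name `in_virtualenv` is hypothetical and not part of this repository.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv.

    Inside an environment created with `python -m venv`, sys.prefix points
    at the environment while sys.base_prefix points at the base install.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment detected; activate venv first.")
```

If the check reports no environment, re-run the activation command above in the current shell.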

Reproducing Results

All results from the paper can be reproduced using the scripts below:

| Figure/Table | Script | Description |
| --- | --- | --- |
| Figure 3 | python src/data_stats/n_gram_overlap.py | N-gram overlap analysis |
| Figure 4 | python src/data_stats/cosScores_overlap.py | Cosine similarity scores (pre-computed; run src/eval/createIndexes.py for fresh scores) |
| Table 1 | python src/data_stats/print_data_stats.py | Dataset statistics (requires loaded dataframes from data/) |
| Table 2 | bash src/get_eval_goldRiM_silverRiM.sh | GoldRiM vs SilverRiM evaluation |
| Table 3 | bash src/get_eval_pwc.sh | Papers with Code evaluation |
| Figures 7–8 | bash src/ablation_.sh | Ablation study results |
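
To run several of these steps in sequence, a small driver can help. The sketch below is an assumption, not a script shipped with this repository: the commands are copied from the table above, while the names `REPRO_COMMANDS`, `command_for`, and `reproduce` are hypothetical. It defaults to a dry run that only prints the commands, and it assumes it is started from the repository root with the virtual environment active.

```python
import shlex
import subprocess

# Commands copied from the table above. The dictionary and helper names
# are hypothetical and do not exist in the repository.
REPRO_COMMANDS = {
    "Figure 3": "python src/data_stats/n_gram_overlap.py",
    "Figure 4": "python src/data_stats/cosScores_overlap.py",
    "Table 1": "python src/data_stats/print_data_stats.py",
    "Table 2": "bash src/get_eval_goldRiM_silverRiM.sh",
    "Table 3": "bash src/get_eval_pwc.sh",
    "Figures 7–8": "bash src/ablation_.sh",
}


def command_for(artifact):
    """Return the argument list for one figure or table."""
    return shlex.split(REPRO_COMMANDS[artifact])


def reproduce(artifacts=None, dry_run=True):
    """Print, and optionally run, the commands for the chosen artifacts.

    With artifacts=None, every entry is processed in table order.
    Returns the list of argument lists that were planned.
    """
    selected = list(REPRO_COMMANDS) if artifacts is None else list(artifacts)
    planned = []
    for name in selected:
        argv = command_for(name)
        planned.append(argv)
        print(f"[{name}] {' '.join(argv)}")
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(argv, check=True)
    return planned


if __name__ == "__main__":
    reproduce(dry_run=True)
```

Calling `reproduce(["Table 2"], dry_run=False)` would execute only the GoldRiM vs SilverRiM evaluation. Keeping the dry run as the default makes it harder to start a long evaluation by accident.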

Citation

If you use or refer to our paper in your research or applications, please cite it using this BibTeX entry:

@inproceedings{Satpute2026,
	title     = {Aspect-Aware Content-Based Recommendations for Mathematical Research Papers},
	author    = {Satpute, Ankit and Greiner-Petter, Andre and Giessing, Noah and Teschke, Olaf and Schubotz, Moritz and Aizawa, Akiko and Gipp, Bela},
	year      = 2026,
	month     = {July},
	booktitle = {Proceedings of the 49th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’26)},
	publisher = {ACM},
	address   = {Melbourne | Naarm, Australia},
	topic     = {rec}
}

License

CC-BY-SA 4.0. The dataset includes non-copyrighted bibliographic metadata and reference data derived from I4OSC (CC0).
