Automatic Inter-document Multi-hop Scientific QA Generation
arXiv cs.CL / 3/17/2026
Key Points
- AIM-SciQA is a new automated framework for generating inter-document, multi-hop scientific QA datasets.
- It generates single-hop QAs with large language models using machine reading comprehension, then builds cross-document relations through embedding-based semantic alignment while selectively drawing on citation information.
- Applied to 8,211 PubMed Central papers, it yields 411,409 single-hop QAs and 13,672 multi-hop QAs, which together form the IM-SciQA dataset. A citation-guided variant, CIM-SciQA, achieves performance comparable to the Oracle setting.
- Human evaluation and automatic metrics confirm high factual consistency, and experiments show that the dataset effectively differentiates performance at the retrieval and QA reasoning stages, providing a realistic benchmark for retrieval-augmented scientific reasoning.
- The approach is extensible beyond PubMed Central, reinforcing the dataset's validity and generality across corpora.
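
The embedding-based alignment step described in the key points can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the cosine-similarity threshold, the toy three-dimensional vectors, and the way citations are used as a filter are all assumptions made for illustration. In practice the embeddings would come from a sentence-embedding model rather than being written by hand.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def align_across_documents(items, threshold=0.8, citations=None):
    """Link single-hop QAs from different documents by embedding similarity.

    items:     list of (doc_id, qa_id, embedding) tuples (hypothetical layout).
    threshold: minimum cosine similarity to keep a link (assumed value).
    citations: optional set of frozenset({doc_a, doc_b}) pairs; when given,
               only document pairs connected by a citation are considered.
               This filter is an assumption about how citation information
               might be applied, not the paper's documented procedure.
    """
    links = []
    for (doc_a, qa_a, emb_a), (doc_b, qa_b, emb_b) in combinations(items, 2):
        if doc_a == doc_b:
            continue  # inter-document links only
        if citations is not None and frozenset((doc_a, doc_b)) not in citations:
            continue
        sim = cosine(emb_a, emb_b)
        if sim >= threshold:
            links.append((qa_a, qa_b, sim))
    # Strongest candidate links first
    return sorted(links, key=lambda link: link[2], reverse=True)


# Toy data: hand-written vectors standing in for real QA embeddings.
items = [
    ("paperA", "A1", [1.0, 0.0, 0.0]),
    ("paperA", "A2", [0.0, 1.0, 0.0]),
    ("paperB", "B1", [0.9, 0.1, 0.0]),
    ("paperC", "C1", [0.0, 0.0, 1.0]),
]

for qa_a, qa_b, sim in align_across_documents(items):
    print(f"{qa_a} <-> {qa_b}: {sim:.3f}")
```

Each linked pair is a candidate for composing a multi-hop question whose answer requires both documents. Passing a `citations` set narrows the candidates to papers that cite one another, which is one plausible reading of a citation-guided variant.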




