UnIte: Uncertainty-based Iterative Document Sampling for Domain Adaptation in Information Retrieval
arXiv cs.AI / 4/29/2026
Key Points
- The paper introduces UnIte, an uncertainty-based iterative document sampling method to improve unsupervised domain adaptation for neural information retrieval models.
- It improves pseudo-query generation by filtering out documents with high aleatoric uncertainty (noise inherent in the data) and prioritizing those with high epistemic uncertainty (what the model has not yet learned), so that sampling targets the documents with the greatest learning utility for the current model.
- Compared with prior sampling approaches that mainly optimize diversity, UnIte more effectively captures model uncertainty to select better documents for adaptation.
- Experiments on the BEIR benchmark with both small and large models show substantial improvements in retrieval quality: gains of +2.45 and +3.49 nDCG@10, using only about 4k training samples on average.
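
The filter-then-prioritize idea in the points above can be illustrated with a minimal sketch. It is not the paper's implementation: the summary does not say how UnIte estimates either uncertainty, so the sketch assumes the common entropy-based decomposition over several stochastic predictions per document (for example ensemble members or dropout passes). The names `decompose_uncertainty`, `select_documents`, `aleatoric_max`, and `budget`, along with the toy probabilities, are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def entropy(p: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def decompose_uncertainty(member_probs: List[Sequence[float]]) -> Tuple[float, float]:
    """Return (aleatoric, epistemic) for one document.

    ASSUMPTION: member_probs holds one predicted distribution per stochastic
    pass. Aleatoric = mean entropy of the passes; epistemic = entropy of the
    mean prediction minus that value (the mutual information).
    """
    n = len(member_probs)
    k = len(member_probs[0])
    mean = [sum(p[i] for p in member_probs) / n for i in range(k)]
    aleatoric = sum(entropy(p) for p in member_probs) / n
    epistemic = max(entropy(mean) - aleatoric, 0.0)  # clamp rounding noise
    return aleatoric, epistemic


def select_documents(
    doc_probs: Dict[str, List[Sequence[float]]],
    aleatoric_max: float,
    budget: int,
) -> List[str]:
    """Drop high-aleatoric documents, then keep the top-`budget` by epistemic."""
    scored = []
    for doc_id, probs in doc_probs.items():
        aleatoric, epistemic = decompose_uncertainty(probs)
        if aleatoric <= aleatoric_max:  # hypothetical noise threshold
            scored.append((epistemic, doc_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:budget]]


# Toy data: two stochastic passes per document, two outcomes each.
doc_probs = {
    "noisy": [[0.5, 0.5], [0.5, 0.5]],         # passes agree it is ambiguous
    "disputed": [[0.95, 0.05], [0.05, 0.95]],  # passes are confident but disagree
    "easy": [[0.99, 0.01], [0.98, 0.02]],      # passes agree and are confident
}

picked = select_documents(doc_probs, aleatoric_max=0.5, budget=2)
print(picked)  # ['disputed', 'easy']
```

Here "noisy" is discarded because every pass is individually unsure (high aleatoric uncertainty), while "disputed" ranks first because the passes disagree with each other (high epistemic uncertainty). In an iterative setup, the scoring would presumably be rerun after each adaptation round so that the selection tracks the current model.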


