A survey of diversity quantification in natural language processing: The why, what, where and how
arXiv cs.CL / 3/16/2026
Key Points
- The paper notes fragmentation and inconsistencies in how NLP papers quantify diversity and calls for a unified approach.
- It adopts Stirling's three dimensions of diversity (variety, balance, and disparity) and maps them onto an NLP-specific framework.
- It surveys over 300 diversity-related NLP papers from the ACL Anthology and organizes the analysis around four perspectives: why diversity matters, what is measured, where it is measured, and how it is measured.
- The authors aim to improve comparability across methods, reveal emerging trends, and provide recommendations to guide future research in the field.
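
To make the three dimensions concrete, here is a minimal Python sketch of one common way to operationalize each of them over a list of category labels. It is an illustration only, not the paper's method: the function names, the choice of normalized Shannon entropy for balance, the choice of mean pairwise distance for disparity, and the toy genre labels and distances are all assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations


def variety(items):
    """Variety: how many distinct categories are present."""
    return len(set(items))


def balance(items):
    """Balance: how evenly items spread over categories.

    Assumed measure: Shannon entropy divided by its maximum, log(k),
    so the result lies in [0, 1]. With fewer than two categories the
    ratio is undefined; this sketch returns 0.0 by convention.
    """
    counts = Counter(items)
    n = len(items)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(k)


def disparity(items, distance):
    """Disparity: how different the categories are from one another.

    Assumed measure: mean distance over all unordered pairs of distinct
    categories, using a caller-supplied distance function.
    """
    categories = sorted(set(items))
    pairs = list(combinations(categories, 2))
    if not pairs:
        return 0.0
    return sum(distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical data: genre labels of five documents in a corpus.
labels = ["news", "news", "news", "fiction", "legal"]

# Hypothetical distances between genres (made up for illustration).
toy_distances = {
    frozenset(("news", "legal")): 0.2,
    frozenset(("news", "fiction")): 0.8,
    frozenset(("fiction", "legal")): 0.9,
}


def toy_distance(a, b):
    return toy_distances[frozenset((a, b))]


print("variety:", variety(labels))
print("balance:", round(balance(labels), 3))
print("disparity:", round(disparity(labels, toy_distance), 3))
```

In practice the distance function would come from something like embedding similarity or a taxonomy, and the three scores answer different questions: two corpora can share the same variety while differing sharply in balance or disparity.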

