Submodular Benchmark Selection
arXiv cs.AI / 5/5/2026
📰 News · Ideas & Deep Analysis · Models & Research
Key Points
- The paper addresses the high cost of evaluating large language models across many correlated benchmarks by framing benchmark subset selection as submodular maximization under a multivariate Gaussian assumption.
- It derives natural objectives from entropy (the log-determinant of the selected covariance submatrix) and from the mutual information between selected and remaining benchmarks, showing that the entropy objective is submodular and connects to pivoted Cholesky with spectral bounds on the residual (see the first sketch after this list).
- It finds that the mutual-information objective is non-monotone in general but empirically monotone for small subsets, which is what makes greedy optimization viable in practice (see the second sketch below).
- Experiments on three matrices derived from ten public leaderboards indicate that mutual-information-based selection outperforms entropy-based selection at imputing held-out benchmark scores when small subsets are selected (an imputation sketch follows below).
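
A minimal NumPy sketch of the entropy-based greedy step, assuming benchmark scores are jointly Gaussian with covariance `Sigma`; the function name and interface are illustrative, not from the paper. The key fact behind the pivoted-Cholesky connection is standard: the marginal log-det gain of adding index j to S is the log of its conditional variance given S, so the greedy step selects the largest residual variance, exactly the pivoted Cholesky pivot rule.

```python
import numpy as np

def greedy_entropy_selection(Sigma, k):
    """Greedily pick k benchmark indices maximizing log det Sigma[S, S].

    The marginal gain of adding index j is log Var(X_j | X_S), so each
    greedy step takes the largest residual (conditional) variance,
    which coincides with the pivot rule of pivoted Cholesky.
    """
    n = Sigma.shape[0]
    d = np.diag(Sigma).astype(float)   # residual variances Var(X_i | X_S)
    L = np.zeros((n, 0))               # partial Cholesky factor of selected columns
    selected = []
    for _ in range(k):
        scores = d.copy()
        scores[selected] = -np.inf     # never re-select an index
        j = int(np.argmax(scores))
        if d[j] <= 0:                  # covariance numerically rank-deficient: stop
            break
        selected.append(j)
        # Rank-one Schur-complement update of all residual variances.
        c = (Sigma[:, j] - L @ L[j]) / np.sqrt(d[j])
        L = np.hstack([L, c[:, None]])
        d = d - c ** 2
    return selected
```

After m steps, `d[i]` equals Var(X_i | X_S), i.e., the variance of benchmark i still unexplained by the selection, which is the quantity the residual bounds are stated over.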
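For the mutual-information objective, a direct (if naive) greedy loop can be written from the standard Gaussian identity I(X_S; X_R) = 0.5 * (log det Sigma_SS + log det Sigma_RR - log det Sigma), where R is the complement of S. Again a sketch under the same Gaussian assumption, with illustrative names; it is not the paper's implementation.

```python
import numpy as np

def _logdet(M):
    # slogdet is more stable than log(det(M)) for nearly singular blocks
    return np.linalg.slogdet(M)[1]

def gaussian_mi(Sigma, S):
    """I(X_S; X_R) for a joint Gaussian, R = complement of S:
    0.5 * (logdet Sigma_SS + logdet Sigma_RR - logdet Sigma)."""
    R = [i for i in range(Sigma.shape[0]) if i not in S]
    return 0.5 * (_logdet(Sigma[np.ix_(S, S)])
                  + _logdet(Sigma[np.ix_(R, R)])
                  - _logdet(Sigma))

def greedy_mi_selection(Sigma, k):
    """Naive greedy loop over the MI objective.

    No submodularity guarantee applies here, per the paper's
    non-monotonicity finding; the loop relies on the empirical
    monotonicity observed for small subsets.
    """
    selected = []
    for _ in range(k):
        rest = [j for j in range(Sigma.shape[0]) if j not in selected]
        selected.append(max(rest, key=lambda j: gaussian_mi(Sigma, selected + [j])))
    return selected
```

Each step evaluates O(n) log-determinants, so this is only practical for the small subset sizes the paper targets; lazy or incremental updates would be the obvious optimization.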
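The imputation task the experiments evaluate can be read as Gaussian conditioning: predict the unselected benchmark scores from the selected ones via the conditional mean. A hedged sketch follows; the mean vector `mu`, the function name, and the use of the exact conditional-mean estimator are assumptions here, not confirmed details of the paper.

```python
import numpy as np

def impute_from_subset(mu, Sigma, x_S, S):
    """Predict unselected benchmark scores from observed ones via the
    Gaussian conditional mean:
        E[X_R | X_S = x_S] = mu_R + Sigma_RS @ Sigma_SS^{-1} @ (x_S - mu_S)
    Returns the unselected indices R and their predicted scores.
    """
    R = [i for i in range(Sigma.shape[0]) if i not in S]
    K_SS = Sigma[np.ix_(S, S)]
    K_RS = Sigma[np.ix_(R, S)]
    mu = np.asarray(mu)
    preds = mu[R] + K_RS @ np.linalg.solve(K_SS, np.asarray(x_S) - mu[S])
    return R, preds
```

Under this reading, a subset is good exactly when the conditional variances of the held-out benchmarks are small, which is why the entropy and mutual-information objectives are natural proxies for imputation error.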