Per https://paperreview.ai/tech-overview, the score correlation between two human reviewers is about 0.41 for ICLR 2025, but in my current project I'm seeing a much lower correlation for ICLR 2026. So I ran the metrics for both 2025 and 2026, and it's crazy. I used two metrics: one-vs-rest correlation and half-half split correlation. All data were fetched from OpenReview. I know top-conference reviews are basically a lottery now for most papers, but I never thought it was this bad.
2025: avg-score SD 1.253, mean within-paper human SD 1.186
2026: avg-score SD 1.162, mean within-paper human SD 1.523
Just did an analysis on ICLR 2025 vs 2026 scores and WOW [D]
Reddit r/MachineLearning / 4/12/2026
💬 Opinion · Signals & Early Trends · Ideas & Deep Analysis · Tools & Practical Usage
Key Points
- The post analyzes ICLR 2025 versus ICLR 2026 review score patterns using reviewer-score correlation metrics drawn from OpenReview data.
- It claims the correlation between reviews by different humans was about 0.41 for ICLR 2025, but observed a much lower correlation for ICLR 2026.
- The author reports that score dispersion differs between years, noting ICLR 2025 has an avg-score standard deviation of 1.253 and ICLR 2026 has 1.162.
- For within-paper human agreement, the reported SD is higher in 2026 (1.523) than in 2025 (1.186), suggesting greater disagreement among reviewers of the same paper.
- The author concludes that acceptance/review outcomes may be “lottery-like” for many papers and expresses surprise at the magnitude of the year-to-year change.
Related Articles
- Black Hat USA (AI Business)
- Black Hat Asia (AI Business)
- AI Agents Explained: 5 Types, Components, Frameworks, and Real-World Use Cases (Dev.to)
- Build Your Own JARVIS: A Deep Dive into Memo AI - The Privacy-First Local Voice Agent (Dev.to)
- Edge-to-Cloud Swarm Coordination for circular manufacturing supply chains with embodied agent feedback loops (Dev.to)