Was looking at an ICLR 2025 Oral paper and I am shocked it got an oral [D]

Reddit r/MachineLearning / 4/15/2026


Key Points

  • A Reddit user reviewed an ICLR 2025 oral paper and claims it evaluated LLM-based SQL code generation using a natural-language metric rather than execution-based validation.
  • The user reports that the authors' own testing found a false positive rate of roughly 20%, and argues this is a major methodological flaw.
  • The post questions how the paper was accepted for an oral presentation despite the alleged evaluation issue.
  • The discussion links to the OpenReview entry for the paper for readers to inspect the review and methods themselves.

After my last post about score analysis of ICLR, I am looking into the review itself now.

They evaluated LLM-based SQL code generation using a natural-language metric rather than an execution-based metric, and when they tested it they found a false positive rate of around 20%. This is a major flaw. How did it even get an oral?
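To make the distinction concrete, here is a minimal sketch of how a text-based score and an execution-based check can disagree on the same prediction. Everything in it is an assumption for illustration: the `employees` table, the two queries, and the token-level F1 used as a stand-in text metric are all hypothetical and are not taken from the paper, whose actual metric may differ.

```python
import sqlite3
from collections import Counter


def token_overlap(pred: str, gold: str) -> float:
    """Token-level F1 between two query strings (a stand-in text metric)."""
    p, g = pred.lower().split(), gold.lower().split()
    common = sum((Counter(p) & Counter(g)).values())
    if common == 0:
        return 0.0
    precision, recall = common / len(p), common / len(g)
    return 2 * precision * recall / (precision + recall)


def execution_match(conn: sqlite3.Connection, pred: str, gold: str) -> bool:
    """True if both queries return the same multiset of rows.

    Row order is ignored here; a stricter check would compare ordered
    lists when the gold query contains ORDER BY.
    """
    try:
        pred_rows = conn.execute(pred).fetchall()
    except sqlite3.Error:
        return False  # a prediction that does not run cannot be correct
    gold_rows = conn.execute(gold).fetchall()
    return Counter(pred_rows) == Counter(gold_rows)


def build_toy_db() -> sqlite3.Connection:
    """Hypothetical in-memory database used only for this example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
    conn.executemany(
        "INSERT INTO employees VALUES (?, ?)",
        [("Ann", 120), ("Bob", 80), ("Cy", 100)],
    )
    return conn


# The two queries differ by a single operator but select different rows.
gold = "SELECT name FROM employees WHERE salary > 100"
pred = "SELECT name FROM employees WHERE salary < 100"

if __name__ == "__main__":
    conn = build_toy_db()
    print(f"text score: {token_overlap(pred, gold):.3f}")
    print(f"execution match: {execution_match(conn, pred, gold)}")
```

Here the predicted query shares seven of its eight tokens with the gold query, so a text-overlap score rates it highly, yet it returns different rows. A metric that accepts such a prediction produces exactly the kind of false positive the post is complaining about.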

https://openreview.net/forum?id=GGlpykXDCa

submitted by /u/Striking-Warning9533