GoodPoint: Learning Constructive Scientific Paper Feedback from Author Responses

arXiv cs.AI / 4/15/2026


Key Points

  • The paper proposes using LLMs to augment researchers by generating constructive, targeted, and actionable feedback on scientific papers rather than automating research without oversight.
  • It introduces a new author-centric evaluation approach (validity and author action) and releases the GoodPoint-ICLR dataset (19K ICLR papers) with reviewer feedback annotated using author responses.
  • It presents the GoodPoint training recipe that fine-tunes on feedback judged both valid and actionable, and uses preference optimization on real and synthetic preference pairs derived from author responses.
  • Experiments on a 1.2K-paper benchmark show a GoodPoint-trained Qwen3-8B improves predicted success rate by 83.7% over the base model and achieves new state-of-the-art results for feedback matching among similarly sized LLMs.
  • A human expert study further supports that GoodPoint feedback is perceived as more practically valuable by authors than alternatives, indicating real-world usefulness.
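The recipe in the bullets above — supervised fine-tuning only on feedback judged both valid and actionable, plus preference optimization contrasting successful and unsuccessful feedback — can be sketched at a data-preparation level. This is a hypothetical illustration, not the paper's actual pipeline: the field names (`valid`, `acted_on`) and the pairing scheme are assumptions about how author-response labels might be turned into SFT examples and DPO-style preference pairs.

```python
# Hypothetical sketch of GoodPoint-style data preparation.
# Field names and pairing logic are illustrative assumptions, not the
# paper's implementation.

def build_training_data(feedback_items):
    """Split annotated feedback into SFT examples and preference pairs.

    Each item is a dict: {"text": str, "valid": bool, "acted_on": bool},
    where the labels are derived from how authors responded to the point.
    SFT keeps only feedback judged both valid and actionable; preference
    pairs contrast a successful point (chosen) with an unsuccessful one
    (rejected) for preference optimization.
    """
    successful = [f["text"] for f in feedback_items
                  if f["valid"] and f["acted_on"]]
    unsuccessful = [f["text"] for f in feedback_items
                    if not (f["valid"] and f["acted_on"])]
    sft_examples = successful
    preference_pairs = [(chosen, rejected)
                        for chosen in successful
                        for rejected in unsuccessful]
    return sft_examples, preference_pairs

# Toy usage on two annotated feedback points for one paper.
feedback = [
    {"text": "Clarify the ablation setup in Sec. 4.",
     "valid": True, "acted_on": True},
    {"text": "The notation is unconventional.",
     "valid": True, "acted_on": False},
]
sft, pairs = build_training_data(feedback)
print(len(sft), len(pairs))  # → 1 1
```

In this toy run, only the first point survives the SFT filter, and it is paired against the point the authors did not act on, yielding one chosen/rejected pair.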

Abstract

While LLMs hold significant potential to transform scientific research, we advocate for their use to augment and empower researchers rather than to automate research without human oversight. To this end, we study constructive feedback generation, the task of producing targeted, actionable feedback that helps authors improve both their research and its presentation. In this work, we operationalize the effectiveness of feedback along two author-centric axes: validity and author action. We first curate GoodPoint-ICLR, a dataset of 19K ICLR papers with reviewer feedback annotated along both dimensions using author responses. Building on this, we introduce GoodPoint, a training recipe that leverages success signals from author responses through fine-tuning on valid and actionable feedback, together with preference optimization on both real and synthetic preference pairs. Our evaluation on a benchmark of 1.2K ICLR papers shows that a GoodPoint-trained Qwen3-8B improves the predicted success rate by 83.7% over the base model and sets a new state-of-the-art among LLMs of similar size in feedback matching on a golden human feedback set, even surpassing Gemini-3-flash in precision. We further validate these findings through an expert human study, demonstrating that GoodPoint consistently delivers higher practical value as perceived by authors.