ReviewGrounder: Improving Review Substantiveness with Rubric-Guided, Tool-Integrated Agents
arXiv cs.CL / 4/17/2026
Key Points
- The paper identifies why LLM-based peer review support can produce superficial, formulaic feedback: it underuses explicit rubrics and contextual grounding in relevant work.
- It introduces REVIEWBENCH, a benchmark that scores review texts against paper-specific rubrics created from official guidelines, the paper content, and human-written reviews.
- It proposes REVIEWGROUNDER, a rubric-guided, tool-integrated multi-agent system that splits reviewing into drafting and evidence-grounding stages to improve depth.
- Experiments on REVIEWBENCH show that REVIEWGROUNDER produces higher-quality reviews, aligned with human judgments across eight rubric dimensions, even with smaller backbone models than some strong baselines.
- The authors release the code publicly on GitHub for reproducibility and further development.
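
The two-stage split described above (draft first, then ground each point in evidence) can be sketched roughly as follows. This is an illustrative outline only, not the paper's implementation: the agent names, the `ReviewPoint` structure, and the `retrieve` tool interface are all assumptions for the sketch.

```python
from dataclasses import dataclass, field

@dataclass
class ReviewPoint:
    # Hypothetical per-point record; the paper's actual schema is not shown here.
    rubric_dimension: str               # e.g., "novelty", "soundness"
    claim: str                          # the reviewer's assertion
    evidence: list[str] = field(default_factory=list)

def draft_stage(paper_summary: str, rubric: list[str]) -> list[ReviewPoint]:
    """Stage 1 (stub): produce one candidate point per rubric dimension.
    In the real system an LLM agent would write these from the paper text."""
    return [ReviewPoint(dim, f"Assessment of {dim} for: {paper_summary}")
            for dim in rubric]

def grounding_stage(points, retrieve):
    """Stage 2 (stub): attach tool-retrieved evidence to each drafted point,
    e.g., snippets from related work found via a search tool."""
    for p in points:
        p.evidence = retrieve(p.claim)
    return points

# Toy retriever standing in for a real search/grounding tool.
def toy_retrieve(query: str) -> list[str]:
    return [f"[snippet relevant to: {query[:40]}]"]

review = grounding_stage(
    draft_stage("a toy paper", ["novelty", "soundness"]),
    toy_retrieve,
)
```

The design point being illustrated is the separation of concerns: drafting optimizes for rubric coverage, while grounding optimizes for evidential support, so shallow unsupported claims can be caught in the second stage.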


