LiRA: A Multi-Agent Framework for Reliable and Readable Literature Review Generation

arXiv cs.CL / 3/23/2026

📰 NewsIdeas & Deep AnalysisModels & Research

共有:

Key Points

LiRA introduces a multi-agent workflow with specialized agents for outlining content, writing subsections, editing, and reviewing to automate literature-review generation.
It is evaluated on SciReviewGen and a proprietary ScienceDirect dataset, where LiRA outperforms baselines such as AutoSurvey and MASS-Survey in writing quality and citation quality while preserving similarity to human reviews.
The study demonstrates robustness to reviewer model variation and viability in real-world document retrieval scenarios, supporting the practical utility of agentic LLM workflows for scientific writing.
The results indicate that substantial improvements can be achieved without domain-specific tuning, suggesting LiRA's approach may scale to broader systematic-review tasks.

Abstract

The rapid growth of scientific publications has made it increasingly difficult to keep literature reviews comprehensive and up-to-date. Though prior work has focused on automating retrieval and screening, the writing phase of systematic reviews remains largely under-explored, especially with regard to readability and factual accuracy. To address this, we present LiRA (Literature Review Agents), a multi-agent collaborative workflow which emulates the human literature review process. LiRA utilizes specialized agents for content outlining, subsection writing, editing, and reviewing, producing cohesive and comprehensive review articles. Evaluated on SciReviewGen and a proprietary ScienceDirect dataset, LiRA outperforms current baselines such as AutoSurvey and MASS-Survey in writing and citation quality, while maintaining competitive similarity to human-written reviews. We further evaluate LiRA in real-world scenarios using document retrieval and assess its robustness to reviewer model variation. Our findings highlight the potential of agentic LLM workflows, even without domain-specific tuning, to improve the reliability and usability of automated scientific writing.

Is AI becoming a bubble, and could it end like the dot-com crash?

Reddit r/artificial

Externalizing State

Dev.to

I made a 'benchmark' where LLMs write code controlling units in a 1v1 RTS game.

Dev.to

My AI Does Not Have a Clock

Dev.to

How to settle on a coding LLM ? What parameters to watch out for ?

Reddit r/LocalLLaMA

LiRA: A Multi-Agent Framework for Reliable and Readable Literature Review Generation

Key Points

Abstract

Related Articles

Is AI becoming a bubble, and could it end like the dot-com crash?

Externalizing State

I made a 'benchmark' where LLMs write code controlling units in a 1v1 RTS game.

My AI Does Not Have a Clock

How to settle on a coding LLM ? What parameters to watch out for ?

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer