ReLay: Personalized LLM-Generated Plain-Language Summaries for Better Understanding, but at What Cost?

arXiv cs.CL / 5/4/2026


Key Points

  • Plain-language summaries (PLS) are intended to make research accessible, but existing one-size-fits-all versions often fail to match individual readers’ needs—especially risky in health contexts where misunderstandings can affect real decisions.
  • The paper introduces ReLay, a dataset containing 300 participant–PLS pairs from 50 lay participants, covering both static expert-written summaries and interactive LLM-personalized summaries.
  • Evaluating five LLMs with two personalization approaches, the study finds that personalization improves comprehension and perceived quality.
  • However, personalization can also increase the risk of reinforcing user biases and producing hallucinations, creating a clear trade-off between personalization benefits and safety/trustworthiness.
  • The results suggest that future personalization methods should be designed to improve understanding while mitigating bias and hallucination risks for diverse lay audiences.

Abstract

Plain Language Summaries (PLS) aim to make research accessible to lay readers, but they are typically written in a one-size-fits-all style that ignores differences in readers' information needs and comprehension. In health contexts, this limitation is particularly important because misunderstanding scientific information can affect real-world decisions. Large language models (LLMs) offer new opportunities for personalizing PLS, but it remains unclear whether personalization helps, which strategies are most effective, and how to balance personalization with safety. We introduce ReLay, a dataset of 300 participant–PLS pairs from 50 lay participants in both static (expert-written) and interactive (LLM-personalized) settings. ReLay includes user characteristics, health information needs, information-seeking behavior, comprehension outcomes, interaction logs, and quality ratings. We use ReLay to evaluate five LLMs across two personalization methods. Personalization improves comprehension and perceived quality, but it also raises the risk of reinforcing user biases and introducing hallucinations, revealing a trade-off between personalization and safety. These findings highlight the need for personalization methods that are both effective and trustworthy for diverse lay audiences.
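To make the dataset description concrete, the record structure described in the abstract could be sketched as follows. This is purely an illustrative assumption: the field names, types, and `ReLayRecord` class are hypothetical and do not reflect the paper's actual schema.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of one participant-PLS pair in ReLay.
# All field names here are illustrative assumptions, not the
# dataset's real schema.
@dataclass
class ReLayRecord:
    participant_id: str                  # one of the 50 lay participants
    setting: str                         # "static" (expert-written) or
                                         # "interactive" (LLM-personalized)
    pls_text: str                        # the plain-language summary shown
    user_characteristics: dict = field(default_factory=dict)
    health_info_needs: list = field(default_factory=list)
    interaction_log: list = field(default_factory=list)  # empty for static PLS
    comprehension_score: float = 0.0     # comprehension outcome measure
    quality_rating: int = 0              # perceived-quality rating

# Example usage with placeholder values:
record = ReLayRecord(
    participant_id="P01",
    setting="interactive",
    pls_text="...",
    comprehension_score=0.8,
    quality_rating=4,
)
```

A flat per-pair record like this would make it straightforward to compare static and interactive settings for the same participant, which is the paper's central contrast.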