The Impact of Ideological Discourses in RAG: A Case Study with COVID-19 Treatments

arXiv cs.CL / 3/17/2026

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper examines how retrieved ideological texts influence the outputs of large language models (LLMs) within a Retrieval-Augmented Generation (RAG) framework using a corpus of 1,117 academic articles on COVID-19 treatments.
  • It introduces a corpus linguistics approach based on Lexical Multidimensional Analysis (LMDA) to identify ideologies in the external texts and applies prompts with and without LMDA-derived descriptions to elicit LLM responses.
  • The results show that LLM outputs align more with the ideology present in the retrieved texts, and using an enhanced prompt further increases this alignment, highlighting potential ideological bias propagation in RAG systems.
  • The study discusses risks of ideological manipulation and emphasizes the need to identify and mitigate ideological discourses within RAG to reduce bias and manipulation in AI models.

Abstract

This paper studies the impact of retrieved ideological texts on the outputs of large language models (LLMs). While interest in understanding ideology in LLMs has recently increased, little attention has been given to this issue in the context of Retrieval-Augmented Generation (RAG). To fill this gap, we design an external knowledge source built from ideologically loaded texts about COVID-19 treatments. Our corpus comprises 1,117 academic articles representing discourses about controversial and endorsed treatments for the disease. We propose a corpus linguistics framework, based on Lexical Multidimensional Analysis (LMDA), to identify the ideologies within the corpus. LLMs are tasked with answering questions derived from three identified ideological dimensions, and two types of contextual prompts are adopted: the first comprises the user question and ideological texts; the second contains the question, ideological texts, and LMDA descriptions. Ideological alignment between reference ideological texts and LLMs' responses is assessed using cosine similarity over lexical and semantic representations. Results demonstrate that LLMs' responses based on retrieved ideological texts are more aligned with the ideology present in the external knowledge, with the enhanced prompt further influencing LLMs' outputs. Our findings highlight the importance of identifying ideological discourses within the RAG framework in order to mitigate not just unintended ideological bias, but also the risks of malicious manipulation of such models.
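The alignment metric in the abstract, cosine similarity between text representations, can be sketched with a simple lexical (bag-of-words) variant. This is a minimal stdlib-only illustration, not the paper's implementation: the semantic variant would replace the count vectors with embeddings from a sentence-encoder model, and the example sentences here are placeholders.

```python
import math
from collections import Counter

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity: dot(u, v) / (|u| * |v|), 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def lexical_vectors(a: str, b: str) -> tuple[list[float], list[float]]:
    """Bag-of-words count vectors over the two texts' shared vocabulary."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    vocab = sorted(set(ca) | set(cb))
    return [ca[w] for w in vocab], [cb[w] for w in vocab]

# Toy comparison of a reference ideological text and an LLM response:
reference = "treatment is safe and effective"
response = "the treatment is effective and safe"
u, v = lexical_vectors(reference, response)
score = cosine(u, v)  # close to 1.0 when word choice overlaps heavily
```

A higher score indicates the response reuses the reference text's vocabulary; the paper applies the same similarity to semantic representations, where embeddings capture alignment beyond exact word overlap.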