DeliberationBench: A Normative Benchmark for the Influence of Large Language Models on Users' Views
arXiv cs.AI / 3/12/2026
Key Points
- DeliberationBench is proposed as a normative benchmark for assessing the persuasive influence of large language models (LLMs) on users' beliefs, using deliberative opinion polling as the standard.
- The authors demonstrate the approach with a preregistered randomized experiment involving 4,088 U.S. participants who discussed 65 policy proposals with six frontier LLMs.
- Results indicate that the tested LLMs substantially influenced participants' opinions, and that this influence correlated positively with the net opinion shifts observed after deliberation, suggesting broadly epistemically desirable effects.
- The analysis finds that influence varies across topic areas, demographic subgroups, and model variants, revealing nuanced patterns in how LLMs shape viewpoints.
- The framework is presented as an evaluation and monitoring tool to ensure LLM influence remains aligned with democratically legitimate standards and preserves users’ autonomy in forming their views.