Survey Response Generation: Generating Closed-Ended Survey Responses In-Silico with Large Language Models

arXiv cs.CL / 4/27/2026


Key Points

  • The paper examines how different Survey Response Generation (SRG) methods affect the quality of closed-ended survey responses generated in-silico by LLMs, even though LLMs are trained primarily to generate open-ended text.
  • Using 32 million simulated responses, the study compares 8 SRG methods across 4 political attitude survey tasks and 10 open-weight language models.
  • The results show that SRG method choice leads to significant differences in alignment at both the individual level and the subpopulation level.
  • Restricted Generation Methods perform best overall, while having the model produce reasoning output does not consistently improve alignment.
  • The authors offer practical recommendations for selecting and applying SRG methods when using LLMs to simulate survey responses.
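The difference between individual-level and subpopulation-level alignment can be illustrated with a small sketch. The two metrics below (exact-match accuracy per respondent, and total variation distance between answer distributions) are illustrative assumptions, not necessarily the measures used in the paper, and the function names and toy data are hypothetical.

```python
from collections import Counter


def individual_accuracy(simulated, observed):
    """Share of respondents whose simulated answer equals their observed answer."""
    if len(simulated) != len(observed) or not observed:
        raise ValueError("need two non-empty answer lists of equal length")
    matches = sum(s == o for s, o in zip(simulated, observed))
    return matches / len(observed)


def total_variation(simulated, observed, options):
    """Total variation distance between the two answer distributions (0 = identical)."""
    sim_counts = Counter(simulated)
    obs_counts = Counter(observed)
    return 0.5 * sum(
        abs(sim_counts[o] / len(simulated) - obs_counts[o] / len(observed))
        for o in options
    )


# Toy data (hypothetical): four respondents in one subpopulation.
options = ["agree", "neutral", "disagree"]
observed = ["agree", "agree", "disagree", "neutral"]
simulated = ["agree", "disagree", "agree", "neutral"]

acc = individual_accuracy(simulated, observed)
tv = total_variation(simulated, observed, options)
print(f"individual-level accuracy: {acc:.2f}")
print(f"subpopulation-level TV distance: {tv:.2f}")
```

In this toy case the simulated answers reproduce the subpopulation's answer distribution exactly while matching only half of the individual respondents, which is why the two levels are worth evaluating separately.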

Abstract

Many in-silico simulations of human survey responses with large language models (LLMs) focus on generating closed-ended survey responses, whereas LLMs are typically trained to generate open-ended text. Previous research has used a diverse range of methods for generating closed-ended survey responses with LLMs, and a standard practice remains to be identified. In this paper, we systematically investigate the impact that various Survey Response Generation Methods have on predicted survey responses. We present the results of 32 million simulated survey responses across 8 Survey Response Generation Methods, 4 political attitude surveys, and 10 open-weight language models. We find significant differences between the Survey Response Generation Methods in both individual-level and subpopulation-level alignment. Our results show that Restricted Generation Methods perform best overall, and that reasoning output does not consistently improve alignment. Our work underlines the significant impact that Survey Response Generation Methods have on simulated survey responses, and we develop practical recommendations on the application of Survey Response Generation Methods.
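The abstract does not spell out how a Restricted Generation Method works. One common way to restrict an LLM to closed-ended answers is to look only at the next-token probabilities of the allowed answer options and renormalize them. The sketch below assumes that reading; the function name, the option labels, and the log-probabilities are hypothetical and do not come from the paper or from any particular library.

```python
import math


def restrict_to_options(token_logprobs, options):
    """Keep only the allowed answer tokens and renormalize their probabilities."""
    missing = [o for o in options if o not in token_logprobs]
    if missing:
        raise KeyError(f"no log-probability available for options: {missing}")
    weights = {o: math.exp(token_logprobs[o]) for o in options}
    total = sum(weights.values())
    return {o: w / total for o, w in weights.items()}


# Hypothetical next-token log-probabilities after a survey prompt.
# Open-ended continuations ("I", "The") carry most of the probability mass.
example_logprobs = {
    "1": math.log(0.10),
    "2": math.log(0.30),
    "3": math.log(0.05),
    "I": math.log(0.40),
    "The": math.log(0.15),
}

probs = restrict_to_options(example_logprobs, ["1", "2", "3"])
answer = max(probs, key=probs.get)
print(probs, answer)
```

Under this assumption, the unrestricted model would most likely begin an open-ended sentence, whereas the restricted distribution always yields one of the valid answer options.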