How Open Must Language Models Be to Enable Reliable Scientific Inference?

arXiv cs.CL / March 30, 2026

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper examines how the degree to which a language model is open or closed affects the reliability of scientific inferences drawn from research that uses it.
  • It argues that restrictions on information about model construction and deployment can introduce threats to scientific inference, making many closed models poorly suited for scientific applications.
  • The authors note exceptions where some closed models may still support scientific purposes, but they generally emphasize the risk of unverifiable or non-reproducible behavior.
  • They propose mitigation approaches and recommend that researchers systematically identify inference threats, document mitigation steps, and provide explicit justifications for choosing a specific model.

Abstract

How does the extent to which a model is open or closed impact the scientific inferences that can be drawn from research that involves it? In this paper, we analyze how restrictions on information about model construction and deployment threaten reliable inference. We argue that current closed models are generally ill-suited for scientific purposes, with some notable exceptions, and discuss ways in which the issues they present to reliable inference can be resolved or mitigated. We recommend that when models are used in research, potential threats to inference should be systematically identified along with the steps taken to mitigate them, and that specific justifications for model selection should be provided.
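To make the closing recommendation concrete, here is a minimal sketch of what such documentation might look like in practice. The structure and names (`ModelUsageRecord`, `InferenceThreat`, and all field names) are illustrative assumptions on my part, not a scheme proposed by the paper; the point is only that identified threats, mitigation steps, and the model-selection justification are captured explicitly and can be published alongside a study.

```python
from dataclasses import dataclass, field


@dataclass
class InferenceThreat:
    """One identified threat to scientific inference, with its mitigation."""
    description: str       # e.g. undisclosed training data may overlap with the test set
    mitigation: str        # step taken to reduce or bound the threat
    fully_mitigated: bool  # whether any residual risk remains


@dataclass
class ModelUsageRecord:
    """Documentation accompanying a study that relies on a language model.

    Hypothetical structure reflecting the paper's recommendation: identify
    threats to inference, record mitigations, and justify model selection.
    """
    model_name: str               # exact model identifier, including version
    accessed_via: str             # e.g. "local weights" or "hosted API"
    access_dates: str             # when queried (hosted models can change silently)
    selection_justification: str  # why this model, given open alternatives
    threats: list[InferenceThreat] = field(default_factory=list)

    def unresolved_threats(self) -> list[InferenceThreat]:
        """Threats whose mitigation leaves residual risk, for honest reporting."""
        return [t for t in self.threats if not t.fully_mitigated]


# Example: documenting a study that used a closed, API-hosted model.
record = ModelUsageRecord(
    model_name="example-closed-model-2026-01",
    accessed_via="hosted API",
    access_dates="2026-02-01 to 2026-02-14",
    selection_justification=(
        "No open model met the required context length; "
        "results are replicated on an open model where feasible."
    ),
    threats=[
        InferenceThreat(
            description="Training data is undisclosed; benchmark items may be contaminated.",
            mitigation="Evaluated only on items written after the model's release date.",
            fully_mitigated=False,
        ),
        InferenceThreat(
            description="The hosted model may be updated without notice, breaking reproducibility.",
            mitigation="Pinned a dated model snapshot and logged all raw responses.",
            fully_mitigated=False,
        ),
    ],
)

for threat in record.unresolved_threats():
    print(f"Residual risk: {threat.description}")
```

Publishing a record like this alongside results would let reviewers see which threats remain unmitigated, rather than having to reconstruct them after the fact.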