Exploring the System 1 Thinking Capability of Large Reasoning Models

arXiv cs.CL / 5/4/2026

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper investigates “System 1 thinking” in Large Reasoning Models (LRMs), focusing on their ability to answer intuitively and efficiently using minimal tokens.
  • It introduces S1-Bench, a multi-domain, multilingual benchmark composed of model-simple System 1 questions, i.e., questions that should be easy for the evaluated models to answer without extended reasoning.
  • Experiments across 28 LRMs show that they tend to be both less accurate and less efficient on System 1-style problems than expected.
  • The study finds that current efficient reasoning techniques may not generalize well to simple questions, and may trade off accuracy to achieve efficiency.
  • The authors observe that LRMs show early awareness of question difficulty, accompanied by lower confidence, and suggest that difficulty is implicitly encoded in their hidden states.
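The accuracy-versus-efficiency trade-off above can be sketched concretely. The snippet below is a minimal, hypothetical scoring function in the spirit of S1-Bench: it is not the paper's evaluation code, and the example responses, token counts, and token budget are invented for illustration.

```python
# Hypothetical sketch: scoring a model on simple "System 1" questions by
# accuracy AND token efficiency. Dataset, token counts, and the 32-token
# budget are illustrative assumptions, not values from the paper.

def score(responses, token_budget=32):
    """responses: list of (is_correct: bool, n_tokens: int) per question."""
    n = len(responses)
    accuracy = sum(ok for ok, _ in responses) / n
    avg_tokens = sum(t for _, t in responses) / n
    # "Overthinking" rate: answers that were correct but still blew the budget.
    overthink = sum(ok and t > token_budget for ok, t in responses) / n
    return {"accuracy": accuracy, "avg_tokens": avg_tokens, "overthink": overthink}

# Example: four simple questions; one wrong, one correct but very verbose.
demo = [(True, 12), (True, 9), (False, 210), (True, 480)]
print(score(demo))  # accuracy 0.75, avg_tokens 177.75, overthink 0.25
```

A metric like `overthink` captures the paper's central complaint: a model can be accurate on a simple question and still fail the System 1 test by spending hundreds of tokens on it.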

Abstract

This paper explores the system 1 thinking capability of Large Reasoning Models (LRMs), the intuitive ability to respond efficiently with minimal token usage. While existing LRMs rely on long-chain reasoning and excel at complex tasks, their system 1 thinking ability remains largely underexplored. This capability is essential as it reflects models' difficulty awareness and reasoning efficiency, both critical for real-world applications. We propose S1-Bench, a multi-domain, multilingual benchmark comprising model-simple system 1 questions. Our investigation of 28 LRMs reveals under-accuracy and inefficiency on system 1 problems. We find existing efficient reasoning methods either generalize poorly to simple questions or sacrifice performance for efficiency. Further exploration uncovers LRMs' early difficulty awareness accompanied by lower confidence, and shows that problem difficulty is implicitly encoded in hidden states.
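The claim that problem difficulty is implicitly encoded in hidden states is typically tested with a linear probe. The sketch below illustrates the idea on synthetic data only: the hidden-state dimension, the planted "difficulty direction", and all hyperparameters are assumptions for illustration, not the paper's setup.

```python
# Minimal linear-probe sketch on SYNTHETIC data: if difficulty is linearly
# decodable from hidden states, a logistic-regression probe trained on those
# states should classify easy vs. hard questions well above chance.
import numpy as np

rng = np.random.default_rng(0)
d = 16                               # hidden-state dimension (illustrative)
direction = rng.normal(size=d)       # pretend "difficulty direction"
direction /= np.linalg.norm(direction)

# Synthetic hidden states: "hard" questions are shifted along the direction.
X_easy = rng.normal(size=(200, d))
X_hard = rng.normal(size=(200, d)) + 1.5 * direction
X = np.vstack([X_easy, X_hard])
y = np.array([0] * 200 + [1] * 200)

# Logistic-regression probe fit by plain gradient descent.
w, b = np.zeros(d), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * (p - y).mean()

acc = ((1 / (1 + np.exp(-(X @ w + b))) > 0.5) == y).mean()
print(f"probe accuracy: {acc:.2f}")  # well above the 0.50 chance level
```

With real models, `X` would instead hold hidden states collected at some layer while the LRM reads each question, and a probe accuracy well above chance would support the paper's conclusion that difficulty is represented before generation begins.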