The Cognitive Penalty: Ablating System 1 and System 2 Reasoning in Edge-Native SLMs for Decentralized Consensus

arXiv cs.AI / 4/21/2026


Key Points

  • The paper studies how “System 1” (autoregressive) versus “System 2” (inference-time reasoning) affects robustness and consensus in edge-native small language models used for DAO proposal vetting.
  • It introduces Sentinel-Bench, an 840-inference evaluation that performs intra-model ablations on Qwen-3.5-9B with frozen weights while varying latent reasoning under an adversarial Optimism DAO dataset.
  • Results show a compute–accuracy inversion: the System 1 baseline achieved 100% adversarial robustness and juridical consistency with state finality in under 13 seconds, while System 2 reasoning caused catastrophic instability.
  • The instability is attributed to a 26.7% reasoning non-convergence (“cognitive collapse”) rate, which reduced trial-to-trial consensus stability to 72.6% and added a 17× latency overhead.
  • The study also observes rare (1.5%) “reasoning-induced sycophancy,” where the model produces very long internal monologues (about 25,750 characters) to rationalize failures, creating additional governance vulnerabilities and risking hardware centralization.
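The two headline failure metrics above (trial-to-trial consensus stability and the reasoning non-convergence rate) can be illustrated with a small sketch. This is not the paper's evaluation code; the exact metric definitions may differ from the plausible readings assumed here (stability as average agreement with each item's majority verdict; non-convergence as traces that never emit a final verdict).

```python
from collections import Counter

def consensus_stability(trials):
    """Average, over items, of the fraction of repeated verdicts that
    agree with that item's majority verdict. One plausible reading of
    'trial-to-trial consensus stability'."""
    scores = []
    for verdicts in trials:
        _, count = Counter(verdicts).most_common(1)[0]
        scores.append(count / len(verdicts))
    return sum(scores) / len(scores)

def non_convergence_rate(outcomes):
    """Share of inferences whose reasoning trace never produced a
    final verdict (represented here as None)."""
    return sum(1 for o in outcomes if o is None) / len(outcomes)

# Toy example: 3 proposals, 5 trials each.
trials = [
    ["reject"] * 5,                                           # fully stable
    ["reject", "reject", "approve", "reject", "reject"],       # 4/5 agree
    ["approve", "reject", "approve", "approve", "reject"],     # 3/5 agree
]
print(consensus_stability(trials))           # (1.0 + 0.8 + 0.6) / 3 = 0.8

outcomes = ["reject", None, "approve", "reject", None, "reject"]
print(non_convergence_rate(outcomes))        # 2 of 6 never converged
```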

Abstract

Decentralized Autonomous Organizations (DAOs) are inclined to explore Small Language Models (SLMs) as edge-native constitutional firewalls to vet proposals and mitigate semantic social engineering. While scaling inference-time compute (System 2) enhances formal logic, its efficacy in highly adversarial, cryptoeconomic governance environments remains underexplored. To address this, we introduce Sentinel-Bench, an 840-inference empirical framework executing a strict intra-model ablation on Qwen-3.5-9B. By toggling latent reasoning across frozen weights, we isolate the impact of inference-time compute against an adversarial Optimism DAO dataset. Our findings reveal a severe compute–accuracy inversion. The autoregressive baseline (System 1) achieved 100% adversarial robustness, 100% juridical consistency, and state finality in under 13 seconds. Conversely, System 2 reasoning introduced catastrophic instability, fundamentally driven by a 26.7% Reasoning Non-Convergence (cognitive collapse) rate. This collapse degraded trial-to-trial consensus stability to 72.6% and imposed a 17× latency overhead, introducing critical vulnerabilities to Governance Extractable Value (GEV) and hardware centralization. While rare (1.5% of adversarial trials), we empirically captured "Reasoning-Induced Sycophancy," where the model generated significantly longer internal monologues (averaging 25,750 characters) to rationalize failing the adversarial trap. We conclude that for edge-native SLMs operating under Byzantine Fault Tolerance (BFT) constraints, System 1 parameterized intuition is structurally and economically superior to System 2 iterative deliberation for decentralized consensus. Code and Dataset: https://github.com/smarizvi110/sentinel-bench
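The abstract frames the result in terms of Byzantine Fault Tolerance: for a verdict to finalize, enough independent edge nodes must agree despite up to f faulty or unstable ones. As a hedged illustration of why the 72.6% stability figure matters, the sketch below applies the standard BFT quorum rule (n = 3f + 1 nodes, quorum of 2f + 1 matching verdicts); the node counts and verdict labels are illustrative, not taken from the paper.

```python
from collections import Counter

def bft_quorum_verdict(verdicts, f):
    """Return the verdict backed by at least 2f + 1 of the n = 3f + 1
    node verdicts, or None if no quorum forms (standard BFT quorum
    rule; illustrative, not the paper's protocol)."""
    assert len(verdicts) == 3 * f + 1, "expects n = 3f + 1 nodes"
    verdict, count = Counter(verdicts).most_common(1)[0]
    return verdict if count >= 2 * f + 1 else None

# With f = 1 (4 nodes, quorum 3): stable System 1 nodes finalize...
print(bft_quorum_verdict(["reject", "reject", "reject", "approve"], f=1))
# → reject

# ...while flip-flopping System 2 nodes can leave the state unfinalized.
print(bft_quorum_verdict(["reject", "approve", "reject", "approve"], f=1))
# → None
```

The point of the sketch: an unstable verdict distribution splits the quorum, so non-convergent reasoning translates directly into stalled finality rather than merely slower answers.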