Linear-Readout Floors and Threshold Recovery in Computation in Superposition

arXiv cs.LG / 5/5/2026


Key Points

  • The paper compares two recent approaches to computation in superposition, Hänni et al.’s approximate-linear recursive template and Adler & Shavit’s thresholded Boolean recovery, and argues that their results are consistent because the two constructions maintain different interface invariants.
  • It records a rank-trace Welch-type lower bound for biorthogonal linear readouts: when the number of features F is much larger than the width d, the worst-case off-diagonal cross-talk of any unit-diagonal linear readout is Ω(d^{-1/2}), and the bound is tight on average for unit-norm tight frames.
  • At a quadratic feature load (F = d^2), the authors show random-support threshold recovery can succeed for sparsities s = O(d/log d), whereas linear readouts still suffer average per-coordinate squared error Ω(s/d) on Bernoulli sparse states.
  • By matching the Welch lower bound against the published tolerance of the Hänni correction layer, the paper explains why the computable-feature scale d^{3/2} appears as a compatibility threshold for that specific template rather than a universal upper limit.
  • The authors note that robust nonlinear reset beyond the Hänni template remains an open problem.

Abstract

Two recent approaches to computation in superposition reach different recursive capacity regimes: Hänni et al. certify Õ(d^{3/2}) computable features in width d via an approximate-linear recursive template, while Adler and Shavit reach near-quadratic capacity (up to logarithmic factors) using thresholded Boolean recovery. The main contribution of this paper is conceptual: we argue these results are not contradictory because they maintain different interface invariants, and we formalize the distinction. As a tool, we record a rank-trace Welch-type lower bound for biorthogonal linear readouts: for F ≫ d, the worst-case off-diagonal cross-talk of any unit-diagonal linear readout is Ω(d^{-1/2}), and the bound is tight on average for unit-norm tight frames. At quadratic feature load F = d^2, random-support threshold recovery succeeds for sparsities s = O(d/log d), while linear readouts still incur Ω(s/d) average per-coordinate squared error on Bernoulli sparse states. Matching the Welch floor against the published tolerance of the Hänni correction layer explains the d^{3/2} scale as a compatibility threshold for that template, not a universal upper bound. Robust nonlinear reset beyond the Hänni template is left open.
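
The Ω(d^{-1/2}) cross-talk floor can be checked numerically. The sketch below is illustrative only and is not taken from the paper. It assumes the classical Welch closed form sqrt((F - d) / (d (F - 1))), which follows from the rank-trace inequality tr(M)^2 ≤ rank(M) · ||M||_F^2 applied to M = decode · encode with unit diagonal and rank at most d. The function names, the sizes d = 32 and F = 512, and the three example readouts are assumptions chosen for the demonstration.

```python
import numpy as np

def welch_floor(F: int, d: int) -> float:
    """Lower bound on the RMS (and so the max) off-diagonal entry of a
    unit-diagonal F x F matrix of rank at most d (classical Welch form)."""
    return float(np.sqrt((F - d) / (d * (F - 1))))

def crosstalk(decode: np.ndarray, encode: np.ndarray) -> tuple:
    """Return (rms, max) of the absolute off-diagonal entries of decode @ encode."""
    M = decode @ encode
    off = M[~np.eye(M.shape[0], dtype=bool)]
    return float(np.sqrt(np.mean(off**2))), float(np.max(np.abs(off)))

rng = np.random.default_rng(0)
d, F = 32, 512  # hypothetical sizes with F >> d
floor = welch_floor(F, d)

# Readout 1: random unit-norm feature directions, decoded by their transpose.
V = rng.standard_normal((d, F))
V /= np.linalg.norm(V, axis=0, keepdims=True)
rms_random, max_random = crosstalk(V.T, V)

# Readout 2: a unit-norm tight frame built from F/d orthonormal bases.
# Its Gram matrix meets the floor exactly in the RMS sense.
T = np.hstack([np.linalg.qr(rng.standard_normal((d, d)))[0] for _ in range(F // d)])
rms_tight, max_tight = crosstalk(T.T, T)

# Readout 3: a non-symmetric biorthogonal readout, the pseudo-inverse of V
# with each row rescaled so that decode[i] @ V[:, i] == 1.
P = np.linalg.pinv(V)
diag = np.einsum("ij,ji->i", P, V)
rms_pinv, max_pinv = crosstalk(P / diag[:, None], V)

print(f"Welch floor  : {floor:.4f}   d^-1/2 = {d ** -0.5:.4f}")
print(f"random frame : rms {rms_random:.4f}  max {max_random:.4f}")
print(f"tight frame  : rms {rms_tight:.4f}  max {max_tight:.4f}")
print(f"pinv readout : rms {rms_pinv:.4f}  max {max_pinv:.4f}")
```

Under these assumptions, every readout's RMS cross-talk sits at or above the floor, and the tight frame meets it up to floating-point error, consistent with the abstract's remark that the bound is tight on average for unit-norm tight frames. Because the maximum off-diagonal entry is never smaller than the RMS, the worst-case cross-talk inherits the same floor.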