LACE: Lattice Attention for Cross-thread Exploration

arXiv cs.AI · April 20, 2026


Key Points

  • The paper introduces LACE, a framework that turns otherwise independent parallel reasoning attempts into a coordinated process using cross-thread attention.
  • By modifying the model’s architecture so that reasoning “threads” can attend to one another and share intermediate insights, the method aims to reduce redundant failures across paths during inference.
  • A key obstacle is the lack of real training data showing collaborative reasoning between parallel paths, which the authors address with a synthetic data pipeline.
  • Experiments report that LACE improves reasoning accuracy by over 7 points compared with standard parallel search approaches.
  • The work suggests that enabling interaction among parallel reasoning paths can make large language models more effective than treating each path as isolated.
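The summary does not spell out LACE’s exact mechanism, but the core idea of cross-thread attention can be illustrated with a minimal sketch. In the toy version below (all function names, shapes, and weight matrices are illustrative assumptions, not the authors’ implementation), the keys and values from every parallel reasoning thread are pooled, so each thread’s tokens can attend over all threads’ intermediate states rather than only their own:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_thread_attention(H, Wq, Wk, Wv):
    """Toy cross-thread attention (illustrative, not the paper's method).

    H: (n_threads, seq_len, d) hidden states, one sequence per reasoning thread.
    Keys/values are flattened across threads into one shared pool, so every
    token in every thread attends over all threads' intermediate states.
    """
    n, t, d = H.shape
    Q = (H @ Wq).reshape(n * t, d)
    K = (H @ Wk).reshape(n * t, d)  # shared key pool across threads
    V = (H @ Wv).reshape(n * t, d)  # shared value pool across threads
    scores = Q @ K.T / np.sqrt(d)   # (n*t, n*t): attention spans thread boundaries
    out = softmax(scores, axis=-1) @ V
    return out.reshape(n, t, d)
```

With ordinary parallel sampling, the attention matrix would be block-diagonal (each thread sees only itself); here the off-diagonal blocks are what let one path incorporate, and potentially correct, another path’s partial reasoning.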

Abstract

Current large language models reason in isolation. Although it is common to sample multiple reasoning paths in parallel, these trajectories do not interact, and often fail in the same redundant ways. We introduce LACE, a framework that transforms reasoning from a collection of independent trials into a coordinated, parallel process. By repurposing the model architecture to enable cross-thread attention, LACE allows concurrent reasoning paths to share intermediate insights and correct one another during inference. A central challenge is the absence of natural training data that exhibits such collaborative behavior. We address this gap with a synthetic data pipeline that explicitly teaches models to communicate and error-correct across threads. Experiments show that this unified exploration substantially outperforms standard parallel search, improving reasoning accuracy by over 7 points. Our results suggest that large language models can be more effective when parallel reasoning paths are allowed to interact.