CAWN: Continuous Acoustic Wave Networks for Autoregressive Language Modeling

arXiv cs.CL / 4/7/2026


Key Points

  • The paper proposes CAWN, a fully continuous sequence-mixing architecture for autoregressive language modeling that replaces transformer attention with multi-headed complex-domain phasors and causal phase accumulation for O(L) scaling.
  • To address long-context signal degradation seen in some linear-time sequence models, CAWN adds a dual-gated Selective Phase Resonance mechanism with frequency-dependent retention, hard-threshold gating, and a Temporal Syntax Cache for short-term dependencies.
  • It improves spatial/feature mixing by using depth-wise harmonic convolutions instead of standard dense projections, and it adds Block Attention Residuals for depth-wise state routing.
  • A 150M-parameter prototype is trained on a 100B-token corpus using continuous streaming and evaluated at a 5B-token milestone; it reportedly supports targeted retrieval across 2,000,000 tokens, with peak VRAM plateauing at 8.72 GB via O(1) chunked-prefill state passing.
  • The authors report empirical benefits using a Targeted Semantic Retrieval protocol, including robust vocabulary acquisition and extended contextual denoising.
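The paper's full kernel details are not reproduced here, but the core idea of the key points above, causal O(L) sequence mixing via complex-domain phasors with frequency-dependent retention, can be illustrated with a minimal NumPy sketch. All names (`phase_accumulation`, the per-head `freqs` and `retain` parameters) are illustrative assumptions, not the paper's API:

```python
import numpy as np

def phase_accumulation(x, freqs, retain):
    """Hypothetical sketch of causal O(L) phase accumulation.

    x:      (L, H) real-valued per-head amplitudes
    freqs:  (H,)   per-head angular frequencies
    retain: (H,)   frequency-dependent retention factors in (0, 1]
    Returns (L, H): real part of the running complex state at each step.
    """
    L, H = x.shape
    t = np.arange(L)[:, None]                       # (L, 1) time indices
    phasors = x * np.exp(1j * freqs[None, :] * t)   # complex-domain phasors
    state = np.zeros(H, dtype=np.complex128)        # O(1) carried state
    out = np.empty((L, H))
    for i in range(L):                              # causal left-to-right scan
        state = retain * state + phasors[i]         # leaky phase accumulation
        out[i] = state.real
    return out
```

Because each step only updates a fixed-size complex state, the scan is linear in sequence length and strictly causal: output at position i depends only on positions ≤ i, which is what permits the constant-memory state passing described later.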

Abstract

Modern Large Language Models (LLMs) rely on Transformer self-attention, which scales quadratically with sequence length. Recent linear-time alternatives, such as State Space Models (SSMs), often suffer from signal degradation over extended contexts. We introduce the Continuous Acoustic Wave Network (CAWN), a fully continuous sequence-mixing architecture. Instead of discrete matrix-based attention, CAWN projects hidden states into multi-headed complex-domain phasors, achieving sequence mixing through a causal, O(L) Phase Accumulation mechanism. To prevent signal degradation over ultra-long contexts, we introduce a dual-gated Selective Phase Resonance mechanism incorporating Frequency-Dependent Retention, Hard-Threshold Gating via Straight-Through Estimation, and a Temporal Syntax Cache to capture short-term local dependencies. We also replace standard dense linear projections with Depth-wise Harmonic Convolutions for optimal spatial frequency mixing, augmented by Block Attention Residuals for depth-wise state routing. Scaled to a 150M-parameter model, CAWN utilizes custom Triton kernels for hardware-efficient, true-complex phase accumulation in float32. Trained via a continuous streaming loop on a 100-Billion-token corpus, the prototype is evaluated at a 5-Billion-token milestone. Empirical evaluations via a Targeted Semantic Retrieval protocol demonstrate robust vocabulary acquisition and extended, explicitly learned contextual denoising. By leveraging O(1) state-passing via chunked prefill, the model retrieves targeted information across 2,000,000 tokens while strictly plateauing at 8.72 GB of Peak VRAM, empirically overcoming the O(L^2) context memory wall.
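The abstract's 2M-token retrieval at a flat VRAM ceiling hinges on chunked prefill with O(1) state passing: the context is consumed in fixed-size chunks, and only a constant-size recurrent state crosses chunk boundaries, so peak memory tracks the chunk size rather than the total context length. The sketch below illustrates that memory pattern with a generic leaky recurrence standing in for CAWN's phase accumulation; all names and the `chunk` parameter are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def chunked_prefill(embedded, retain, chunk=4):
    """Hypothetical sketch of O(1)-state chunked prefill.

    embedded: (L, H) embedded context tokens
    retain:   scalar or (H,) retention factor (stand-in for CAWN's
              frequency-dependent retention)
    chunk:    illustrative chunk size; peak memory scales with this,
              not with the full context length L
    """
    L, H = embedded.shape
    state = np.zeros(H)                      # constant-size carried state
    outputs = []
    for start in range(0, L, chunk):
        block = embedded[start:start + chunk]
        out = np.empty_like(block)
        for i, row in enumerate(block):      # recurrence within the chunk
            state = retain * state + row     # only `state` crosses chunks
            out[i] = state
        outputs.append(out)
    return np.concatenate(outputs, axis=0)
```

Because the per-chunk computation is identical regardless of how many chunks preceded it, the result is exactly the same as a single unchunked scan, which is why memory can plateau while context length keeps growing.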