Nautile-370M: Spectral Memory Meets Attention in a Small Reasoning Model
arXiv cs.LG · April 29, 2026
Key Points
- Nautile-370M is a newly introduced 371M-parameter small language model designed for efficient reasoning under tight parameter and inference budgets.
- Its architecture interleaves SeqCond Attention (SCA) layers, which use a linear-time spectral sequence operator, with standard transformer layers in a repeating two-SCA, one-transformer pattern, balancing long-context state tracking with attention-like routing (see the sketch after this list).
- The authors report training on limited compute: a single Google TPU v4-64 pod slice via the TPU Research Cloud (TRC), followed by a reinforcement learning stage on a single NVIDIA DGX Spark.
- The paper provides a theoretical result that SCA can exactly retrieve individual tokens from prefix summaries and can emulate softmax attention, arguing that SCA is at least as expressive as full self-attention in the continuous limit (see the schematic equations after this list).
- It also outlines a dedicated training data pipeline and proposes a reinforcement learning stage tailored to reasoning, verification, and response quality.
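The layer layout described above can be pictured with a minimal sketch. This is not the released implementation: `SeqCondAttention` below is a hypothetical stand-in (a simple cumulative prefix summary), since the summary does not specify the paper's spectral operator, and the layer width, head count, and number of groups are assumed for illustration only.

```python
# Minimal sketch of the reported 2:1 SCA-to-transformer interleaving, assuming a
# PyTorch-style module stack. All hyperparameters here are illustrative guesses.
import torch
import torch.nn as nn

class SeqCondAttention(nn.Module):
    """Hypothetical placeholder for SCA: keeps a running prefix summary in linear time."""
    def __init__(self, d_model: int):
        super().__init__()
        self.proj_in = nn.Linear(d_model, d_model)
        self.proj_out = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); a cumulative mean stands in for the
        # paper's spectral prefix summary, which is not described in this summary.
        h = self.proj_in(x)
        counts = torch.arange(1, x.size(1) + 1, device=x.device, dtype=x.dtype)
        summary = h.cumsum(dim=1) / counts.view(1, -1, 1)
        return x + self.proj_out(summary)

class NautileBlockStack(nn.Module):
    """Alternates two SCA layers with one standard transformer layer per group."""
    def __init__(self, d_model: int = 512, n_heads: int = 8, n_groups: int = 4):
        super().__init__()
        layers = []
        for _ in range(n_groups):
            layers += [
                SeqCondAttention(d_model),
                SeqCondAttention(d_model),
                nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True),
            ]
        self.layers = nn.ModuleList(layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x)
        return x
```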
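The expressivity result is likewise described only at a high level here. The equations below are a schematic of the general shape of such a claim: the summary map $\phi$, readout $\rho$, and queries, keys, and values $q$, $k_i$, $v_i$ are assumed placeholders, not the paper's actual construction.

```latex
% Schematic only: S_t is a fixed-size prefix summary; phi (encoding) and rho
% (readout) are hypothetical placeholders, not the paper's definitions.
\begin{align}
  S_t &= S_{t-1} + \phi(x_t)
      && \text{linear-time prefix summary, } S_0 = 0 \\
  \rho(S_t, q_i) &= x_i \quad \text{for } i \le t
      && \text{exact retrieval of token } i \text{ from the summary} \\
  \rho(S_t, q) &\approx \sum_{i \le t} \operatorname{softmax}_i\!\left(q^\top k_i\right) v_i
      && \text{emulation of softmax attention}
\end{align}
```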