HubRouter: A Pluggable Sub-Quadratic Routing Primitive for Hybrid Sequence Models
arXiv cs.LG · April 27, 2026
📰 News · Models & Research
Key Points
- HubRouter is a new pluggable routing primitive that replaces O(n^2) attention with an O(nM) hub-mediated mechanism using a small set of learned hub tokens (M << n).
- It uses an encode–decode–score–council pipeline where hub tokens attend to all tokens, tokens compute routing fingerprints against hubs, a score head selects top-k, and a sparse council attends only to the selected subset.
- The paper evaluates HubRouter in multiple architectures (a Jamba-style hybrid and a 12-layer Transformer) and reports that fully retrofitting pretrained models is a negative result in their tests.
- Results show modest perplexity gains or tradeoffs depending on the setup: Hub-Jamba yields a nominal ~4.2% PPL improvement alongside large training-throughput gains; gradually replacing 25% of attention layers in a Transformer improves matched-budget perplexity; and Hub-GPT remains strictly causal but trades quality for avoiding O(n^2) compute.
- Experiments and sweeps suggest a reliable hub-count range of M = 8–14, with M ≥ 20 showing greater sensitivity across random seeds; the authors plan to release code and scripts.
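The encode–decode–score–council pipeline described above can be sketched in a few lines. This is a minimal, single-head illustration based only on the summary: the linear score head, shapes, and function names are assumptions, not the paper's implementation. The point is the cost structure: hub interactions are O(nM) and the final council attention is O(nk), never O(n^2).

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def hub_route(tokens, hubs, k):
    """Hub-mediated routing sketch (illustrative, not the paper's code).

    tokens: (n, d) sequence states; hubs: (M, d) learned hub tokens, M << n.
    Returns updated token states and the indices of the top-k selected tokens.
    """
    n, d = tokens.shape
    scale = 1.0 / np.sqrt(d)
    # 1) Encode: hub tokens attend to all tokens -> O(n * M)
    hub_summary = softmax(hubs @ tokens.T * scale) @ tokens       # (M, d)
    # 2) Decode: each token computes a routing fingerprint against the hubs
    fingerprints = softmax(tokens @ hub_summary.T * scale)        # (n, M)
    # 3) Score: a score head (here a uniform linear head, purely a placeholder)
    #    maps fingerprints to a scalar and selects the top-k tokens
    w = np.full(hub_summary.shape[0], 1.0 / hub_summary.shape[0])
    scores = fingerprints @ w                                     # (n,)
    topk = np.argsort(scores)[-k:]
    # 4) Council: every token attends only to the selected subset -> O(n * k)
    sel = tokens[topk]                                            # (k, d)
    out = softmax(tokens @ sel.T * scale) @ sel                   # (n, d)
    return out, topk
```

With n = 4096 tokens, M = 12 hubs, and k = 64 council members, the dominant terms are nM and nk score entries rather than the n^2 of full attention, which is where the reported throughput gains would come from.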