Where Should LoRA Go? Component-Type Placement in Hybrid Language Models
arXiv cs.LG / 4/27/2026
💬 Opinion · Models & Research
Key Points
- The study argues that standard LoRA practices that apply adapters uniformly are suboptimal for hybrid language models, because different component types (attention vs recurrent/SSM) play distinct functional roles.
- Experiments on Qwen3.5-0.8B and Falcon-H1-0.5B show that placing LoRA only on the attention pathway, even though it is the smaller component, consistently outperforms adapting the full model while using 5–10× fewer trainable parameters (see the sketch after this list).
- Adapting the recurrent backbone has architecture-dependent effects: it is destructive in sequential hybrids (e.g., −14.8 pp on GSM8K) but constructive in parallel hybrids (+8.6 pp).
- The authors also find a transfer asymmetry: parallel hybrids benefit from positive cross-task transfer, while sequential hybrids suffer catastrophic forgetting.
- Overall, the paper concludes that hybrid topology fundamentally changes how the model responds to adaptation, making component-aware LoRA placement an essential design consideration for hybrid architectures.
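
In practice, attention-only placement amounts to restricting which modules receive LoRA adapters. Below is a minimal sketch using the Hugging Face PEFT library; the checkpoint id and module names (`q_proj`, `k_proj`, `v_proj`, `o_proj`) are illustrative assumptions based on common attention-block naming, not the paper's exact configuration, and should be checked against the target hybrid model.

```python
# Sketch: component-aware LoRA placement (attention pathway only) with PEFT.
# The module names and checkpoint id below are assumptions for illustration;
# inspect the actual hybrid model to confirm its attention vs. SSM module names.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("tiiuae/Falcon-H1-0.5B-Base")

# Restrict LoRA to the attention projections, leaving the recurrent/SSM
# backbone frozen -- the placement the study reports as stronger for
# sequential hybrids.
attention_only = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, attention_only)
model.print_trainable_parameters()  # should show a small fraction of total parameters
```

Adapting the recurrent backbone instead would mean pointing `target_modules` at the SSM projections, which, per the paper's results, helps in parallel hybrids but can be destructive in sequential ones.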