Defragmenting Language Models: An Interpretability-based Approach for Vocabulary Expansion
arXiv cs.CL / 4/21/2026
Key Points
- The paper studies “token over-fragmentation” in modern open-weight LLMs, where text in non-Latin-script languages requires several times as many tokens as English to convey the same information.
- It proposes an interpretability-based vocabulary expansion approach that revisits two key choices: which vocabulary items to add and how to initialize their input/output embeddings.
- The authors argue against relying solely on frequency-based candidate selection and show that interpretability-based selection achieves a better trade-off between task performance and token efficiency.
- They report that interpretability-grounded embedding initialization can yield large gains (around 20 points) over baseline initialization methods for several non-Latin-script languages.
- Based on analysis of “subword detokenization,” the paper introduces FragMend to push efficiency further, validating it with comparisons to strong baselines and extensive ablation-style analysis.
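
To make the two ideas in the key points concrete, here is a minimal Python sketch of (1) measuring over-fragmentation as a token-count ratio against English and (2) initializing a newly added token's embedding. Everything in it is an assumption for illustration: the greedy longest-match tokenizer, the tiny vocabulary, the toy embeddings, and the helper names `greedy_tokenize`, `fragmentation_ratio`, and `mean_init` are all hypothetical. The initialization shown is simple averaging of subword embeddings, a common baseline in vocabulary expansion work; it is not FragMend and not the paper's interpretability-based method, whose details this summary does not give.

```python
from typing import Dict, List, Set


def greedy_tokenize(text: str, vocab: Set[str]) -> List[str]:
    """Toy longest-match tokenizer; unknown characters become single tokens."""
    tokens: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest vocabulary entry that starts at position i.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No vocabulary entry matched: emit one character on its own.
            tokens.append(text[i])
            i += 1
    return tokens


def fragmentation_ratio(target: str, english: str, vocab: Set[str]) -> float:
    """Tokens needed for `target` divided by tokens for its English parallel."""
    return len(greedy_tokenize(target, vocab)) / len(greedy_tokenize(english, vocab))


def mean_init(pieces: List[str], embeddings: Dict[str, List[float]]) -> List[float]:
    """Baseline init: average the embeddings of the pieces a new token replaces."""
    dim = len(embeddings[pieces[0]])
    return [sum(embeddings[p][k] for p in pieces) / len(pieces) for k in range(dim)]


# Toy data (hypothetical): English words are in the vocabulary, Cyrillic ones are not.
vocab = {"hello", "world", " "}
before = fragmentation_ratio("привет мир", "hello world", vocab)

# Expanding the vocabulary with whole words removes the fragmentation.
expanded = vocab | {"привет", "мир"}
after = fragmentation_ratio("привет мир", "hello world", expanded)

# Initialize the new token "мир" from the pieces it previously split into.
toy_embeddings = {"м": [1.0, 0.0], "и": [0.0, 1.0], "р": [1.0, 1.0]}
new_vector = mean_init(greedy_tokenize("мир", vocab), toy_embeddings)

print(before, after, new_vector)
```

With this toy data the Russian sentence needs 10 tokens against 3 for its English parallel before expansion, and 3 against 3 afterwards. Real LLM tokenizers typically operate on bytes with learned merges, so actual ratios depend on the model's tokenizer and the evaluation text.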