Steering Sparse Autoencoder Latents to Control Dynamic Head Pruning in Vision Transformers (Student Abstract)
arXiv cs.CV · March 31, 2026
Key Points
- The paper addresses a limitation of dynamic head pruning in Vision Transformers: existing pruning policies are hard to interpret and offer little direct control over which heads are kept.
- It proposes a framework that trains a Sparse Autoencoder (SAE) on the ViT’s final-layer residual embeddings and then amplifies selected sparse latents to steer the pruning policy toward different head-pruning decisions (a minimal sketch follows this list).
- The approach supports “per-class steering,” which discovers compact, class-specific subsets of attention heads while maintaining accuracy (a second sketch after this list illustrates one way to realize this).
- One reported example: steering for the “bowl” class improves accuracy from 76% to 82% while cutting the head-usage ratio from 0.72 to 0.33, pruning down to just heads h2 and h5.
- The authors argue the method links pruning efficiency with mechanistic interpretability by making pruning behavior controllable through sparse, disentangled features.
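
The abstract does not include an implementation, but the steering loop it describes is straightforward to sketch. Below is a minimal PyTorch sketch, assuming a standard ReLU sparse autoencoder over final-layer residual embeddings; `SparseAutoencoder`, `steer_and_gate`, `gate_head` (a stand-in for the learned pruning policy), `latent_ids`, and `alpha` are illustrative names and values, not the paper’s API.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Minimal SAE over final-layer ViT residual embeddings (illustrative)."""
    def __init__(self, d_model: int, n_latents: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_latents)
        self.decoder = nn.Linear(n_latents, d_model)

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.encoder(x))  # ReLU keeps latent codes sparse and non-negative

    def decode(self, z: torch.Tensor) -> torch.Tensor:
        return self.decoder(z)

def steer_and_gate(sae, embedding, latent_ids, alpha, gate_head):
    """Amplify selected SAE latents, decode back to residual space, and let a
    (hypothetical) gating head score which attention heads to keep."""
    z = sae.encode(embedding)
    z[..., latent_ids] *= alpha            # amplify the chosen sparse latents
    steered = sae.decode(z)                # steered residual embedding
    keep_prob = torch.sigmoid(gate_head(steered))
    return keep_prob > 0.5                 # boolean per-head keep mask

# Usage (all sizes and indices are placeholders):
d_model, n_latents, n_heads = 768, 4096, 12
sae = SparseAutoencoder(d_model, n_latents)
gate_head = nn.Linear(d_model, n_heads)    # stand-in for the learned pruning policy
emb = torch.randn(1, d_model)              # e.g. the CLS-token residual embedding
mask = steer_and_gate(sae, emb, latent_ids=[17, 93], alpha=4.0, gate_head=gate_head)
```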
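
Per-class steering can then be framed as a loop over classes: steer each class’s associated latents and record which heads the gate keeps across that class’s examples. A hedged sketch reusing the names above; `class_latents` (a class → latent-index map) and the 0.9 agreement threshold are assumptions, not values from the paper.

```python
def discover_class_heads(sae, gate_head, embeddings_by_class, class_latents, alpha=4.0):
    """For each class, amplify its associated latents and keep the heads that the
    gate retains on (nearly) all of that class's examples."""
    subsets = {}
    for cls, embs in embeddings_by_class.items():
        masks = [steer_and_gate(sae, e, class_latents[cls], alpha, gate_head)
                 for e in embs]
        agreement = torch.stack(masks).float().mean(dim=0)  # per-head keep frequency
        subsets[cls] = (agreement > 0.9).nonzero(as_tuple=True)[-1].tolist()
    return subsets
```

Under this framing, the reported “bowl” result would correspond to `subsets['bowl'] == [2, 5]`, i.e. pruning down to heads h2 and h5.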