FineSteer: A Unified Framework for Fine-Grained Inference-Time Steering in Large Language Models
arXiv cs.AI / 4/20/2026
Key Points
- FineSteer is a new inference-time steering framework for large language models that aims to reduce issues like safety violations and hallucinations without updating model parameters.
- The framework splits steering into two stages: Subspace-guided Conditional Steering (SCS), which avoids unnecessary changes that would harm utility, and Mixture-of-Steering-Experts (MoSE), which produces query-specific steering vectors.
- SCS preserves general model utility by steering only when needed, rather than applying a rigid one-size-fits-all adjustment.
- MoSE improves effectiveness by modeling the multimodal nature of desirable behaviors and synthesizing fine-grained steering vectors tailored to each input.
- Experiments on safety and truthfulness benchmarks indicate that FineSteer outperforms existing state-of-the-art approaches with minimal utility loss, and the authors have released their code.
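
To make the two-stage idea above concrete, here is a minimal NumPy sketch of conditional steering followed by a mixture of steering vectors. It is an illustration under stated assumptions, not the authors' implementation: the summary does not say how SCS decides when to steer or how MoSE combines its experts, so the subspace-energy test, the softmax gate, and every name here (`subspace_score`, `mixture_vector`, `steer`, `threshold`, `alpha`) are hypothetical.

```python
import numpy as np

def subspace_score(h, basis):
    """Fraction of h's squared norm lying in span(basis).

    ASSUMPTION: `basis` has orthonormal columns spanning a subspace
    associated with inputs that need steering. This test is a stand-in
    for whatever criterion SCS actually uses.
    """
    coords = basis.T @ h
    return float(coords @ coords) / float(h @ h + 1e-12)

def mixture_vector(h, gate_weights, expert_vectors):
    """Softmax-gated weighted sum of expert steering vectors.

    ASSUMPTION: a linear gate over the hidden state; one row of
    `expert_vectors` per expert.
    """
    logits = gate_weights @ h
    logits = logits - logits.max()  # subtract max for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return probs @ expert_vectors

def steer(h, basis, gate_weights, expert_vectors, threshold=0.5, alpha=1.0):
    """Return h unchanged unless the subspace test fires, else add a
    query-specific steering vector scaled by alpha."""
    if subspace_score(h, basis) < threshold:
        return h  # conditional stage: no steering, utility preserved
    return h + alpha * mixture_vector(h, gate_weights, expert_vectors)

# Toy usage with a 4-dimensional hidden state and two experts.
basis = np.eye(4)[:, :2]            # subspace = first two axes
gate_weights = np.zeros((2, 4))     # zero gate gives uniform expert weights
expert_vectors = np.array([[0.0, 0.0, 1.0, 0.0],
                           [0.0, 0.0, 0.0, 1.0]])
h = np.array([1.0, 1.0, 0.0, 0.0])
steered = steer(h, basis, gate_weights, expert_vectors)
```

The design point this sketch tries to capture is that the gate in `steer` leaves most hidden states untouched, while the mixture step lets the added vector vary per input instead of being a single fixed direction.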



