Qwen Team released Qwen-Scope, a collection of Sparse Autoencoders (SAEs) for the Qwen 3.5 family (from 2B to 35B MoE). They've mapped internal features for the residual stream across all layers.

What is this exactly? Think of it as a dictionary of the model's internal concepts. Instead of looking at raw numbers, you can see specific "features" that represent concepts like "legal talk", "Python code", or "refusal".

What can you do with this?
How it works in practice (Space demo example):
Space: https://huggingface.co/spaces/Qwen/Qwen-Scope
Technical Report: https://qianwen-res.oss-accelerate.aliyuncs.com/qwen-scope/Qwen_Scope.pdf
Qwen-Scope: Official Sparse Autoencoders (SAEs) for Qwen 3.5 models
Reddit r/LocalLLaMA / 4/30/2026
📰 News, Ideas & Deep Analysis, Tools & Practical Usage, Models & Research
Key Points
- Qwen Team released Qwen-Scope, an open collection of sparse autoencoders (SAEs) that map interpretable internal features in the residual stream of Qwen 3.5 models from 2B to 35B MoE.
- The release provides a “dictionary” view of model concepts (e.g., refusal, legal-domain language, Python code, style-related features) and tools to identify which feature IDs activate on specific inputs.
- Qwen-Scope enables applications such as “surgical ablation” (suppressing a targeted feature), feature steering (amplifying or forcing certain concepts during generation), and debugging behaviors like unexpected language switching.
- It also supports dataset and fine-tuning analysis by checking whether training examples actually trigger the intended internal features.
- The team discourages using the tools to remove safety filters or otherwise degrade model capabilities, although the same feature controls make this technically possible.
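
The operations in the points above (looking up active feature IDs, ablating a feature, steering with a feature, and checking whether dataset examples trigger a feature) can be sketched on a toy SAE. This is a minimal illustration that assumes a plain ReLU sparse autoencoder with random weights; the class `ToySAE`, the helper names, and the dimensions are all hypothetical and are not the Qwen-Scope API. The real architecture and weight-loading details are in the technical report.

```python
import numpy as np


class ToySAE:
    """Toy ReLU sparse autoencoder over a residual-stream vector.

    Weights are random and for illustration only (assumption: a real SAE
    would load trained encoder/decoder weights for a specific layer).
    """

    def __init__(self, d_model, n_features, seed=0):
        rng = np.random.default_rng(seed)
        self.W_enc = rng.normal(scale=0.1, size=(d_model, n_features))
        self.b_enc = np.zeros(n_features)
        self.W_dec = rng.normal(scale=0.1, size=(n_features, d_model))
        self.b_dec = np.zeros(d_model)

    def encode(self, x):
        # Feature activations: non-negative, one value per dictionary entry.
        return np.maximum(0.0, (x - self.b_dec) @ self.W_enc + self.b_enc)

    def decode(self, f):
        # Reconstruct the residual vector as a weighted sum of decoder rows.
        return f @ self.W_dec + self.b_dec


def top_features(sae, x, k=5):
    """Return the k most active (feature_id, activation) pairs for x."""
    acts = sae.encode(x)
    idx = np.argsort(acts)[::-1][:k]
    return [(int(i), float(acts[i])) for i in idx]


def ablate(sae, x, feature_id):
    """Subtract one feature's decoder contribution from x."""
    acts = sae.encode(x)
    return x - acts[feature_id] * sae.W_dec[feature_id]


def steer(sae, x, feature_id, strength):
    """Push x along one feature's decoder direction."""
    return x + strength * sae.W_dec[feature_id]


def coverage(sae, X, feature_id, threshold=0.0):
    """Fraction of rows in X whose activation for feature_id exceeds threshold."""
    acts = sae.encode(X)[:, feature_id]
    return float(np.mean(acts > threshold))


if __name__ == "__main__":
    sae = ToySAE(d_model=16, n_features=64)
    rng = np.random.default_rng(1)
    x = rng.normal(size=16)           # stand-in for one token's residual vector
    fid, act = top_features(sae, x, k=1)[0]
    print("most active feature:", fid, "activation:", act)
    print("activation after ablation:", sae.encode(ablate(sae, x, fid))[fid])
    print("activation after steering:", sae.encode(steer(sae, x, fid, 4.0))[fid])
    X = rng.normal(size=(100, 16))    # stand-in for a batch of dataset examples
    print("coverage of that feature:", coverage(sae, X, fid))
```

In a real setup the vector `x` would come from a hook on the model's residual stream at the layer the SAE was trained on, and the edited vector would be written back before the forward pass continues. Because a toy SAE's features are not orthogonal, ablating one feature can shift the activations of others.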