From Local to Global to Mechanistic: An iERF-Centered Unified Framework for Interpreting Vision Models
arXiv cs.CV / 5/4/2026
Key Points
- The paper proposes an iERF-centered interpretability framework that unifies local, global, and mechanistic explanations of vision models using a single analysis unit: the pointwise feature vector (PFV) plus its instance-specific effective receptive field (iERF).
- It introduces Sharing Ratio Decomposition (SRD) to express each PFV as a mixture of upstream PFVs and propagate iERFs downstream, producing activation-faithful, class-discriminative saliency maps that are robust to manipulations and noise (a minimal sketch follows this list).
- For global interpretability, it presents Concept-Anchored Feature Explanation (CAFE), using the iERF to semantically label latent vectors and ground sparse autoencoder features in verifiable pixel-level evidence.
- To explain how concepts are composed across network depth, it proposes the Interlayer Concept Graph with Interlayer Concept Attribution (ICAT), and uses an interlayer insertion/deletion protocol to identify Integrated Gradients as the most faithful attribution instantiation (also sketched below).
- Experiments across ResNet50, VGG16, and Vision Transformers show improved fidelity and robustness over baselines, including for dispersed SAE features, and the framework highlights dominant concept routes in correct, incorrect, and adversarial cases.
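The paper's exact SRD formulation is not reproduced here, but the key points imply a simple structure: each downstream PFV is rebuilt as a weighted mixture of upstream PFV contributions whose weights (the sharing ratios) sum to one, and those same weights propagate the upstream iERF maps forward. Below is a minimal NumPy sketch under that assumption; the projection-based weighting and the function names are illustrative choices, not the paper's definitions.

```python
import numpy as np

def sharing_ratios(downstream_pfv, upstream_contribs):
    """Illustrative sharing-ratio rule (an assumption, not the paper's
    exact definition): project each upstream contribution onto the
    downstream PFV and normalize so the ratios sum to 1.

    downstream_pfv:    (C,) feature vector at one spatial position.
    upstream_contribs: (K, C) contribution vectors from K upstream PFVs
                       whose sum reconstructs downstream_pfv.
    """
    proj = upstream_contribs @ downstream_pfv   # share along the PFV direction, (K,)
    return proj / (proj.sum() + 1e-12)          # normalize so ratios sum to ~1

def propagate_ierf(upstream_ierfs, ratios):
    """Propagate iERFs downstream: mix the K upstream pixel-support maps
    (each H x W) with the sharing ratios to get the downstream iERF."""
    return np.tensordot(ratios, upstream_ierfs, axes=1)   # (H, W)

# Toy usage: 3 upstream PFVs of width 4 over an 8x8 input.
rng = np.random.default_rng(0)
contribs = rng.normal(size=(3, 4))
pfv = contribs.sum(axis=0)        # downstream PFV = sum of its contributions
r = sharing_ratios(pfv, contribs)
ierf = propagate_ierf(rng.random((3, 8, 8)), r)
```

Note that with this illustrative rule individual ratios can be negative; the paper may normalize differently, but the mixture-plus-propagation structure is what the summary describes.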
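For the faithfulness comparison, the key points name standard Integrated Gradients applied between layers. The sketch below shows IG over an intermediate activation plus a toy interlayer deletion curve; the `head` interface (batch of activations to per-example scalar scores), the zero baseline, and the deletion scoring are assumptions for illustration, not the paper's exact protocol.

```python
import torch

def integrated_gradients(head, x, baseline=None, steps=50):
    """Integrated Gradients (Sundararajan et al., 2017), applied here to
    an intermediate activation x rather than pixels. `head` maps a batch
    of activations to per-example scalar scores (assumed interface)."""
    if baseline is None:
        baseline = torch.zeros_like(x)          # zero baseline: an assumption
    # Riemann approximation of the straight-line path integral.
    alphas = torch.linspace(0.0, 1.0, steps).view(-1, *([1] * x.dim()))
    path = (baseline + alphas * (x - baseline)).detach().requires_grad_(True)
    grads = torch.autograd.grad(head(path).sum(), path)[0]
    return (x - baseline) * grads.mean(dim=0)   # completeness holds up to step error

def deletion_curve(head, x, attribution, steps=10):
    """Toy interlayer deletion protocol: zero activation entries in
    decreasing order of attribution and record the score after each
    chunk. A faster drop suggests a more faithful attribution (the
    paper's exact metric may differ)."""
    order = attribution.flatten().argsort(descending=True)
    scores, xd = [], x.clone().flatten()
    chunk = max(1, order.numel() // steps)
    for i in range(0, order.numel(), chunk):
        xd[order[i:i + chunk]] = 0.0
        with torch.no_grad():
            scores.append(head(xd.view(1, *x.shape)).item())
    return scores
```

Comparing the deletion curves of IG against other attribution instantiations over the same interlayer edges is, per the summary, how the paper identifies IG as the most faithful choice.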