Stylistic-STORM (ST-STORM): Perceiving the Semantic Nature of Appearance
arXiv cs.CV / 4/20/2026
Key Points
- The paper argues that many self-supervised learning (SSL) methods, such as MoCo and DINO, learn representations that are invariant to appearance changes, which can fail when appearance itself is the key discriminative signal.
- It introduces Stylistic-STORM (ST-STORM), a hybrid self-supervised learning framework that disentangles “content” from “style” by using two latent streams controlled via gating mechanisms.
- The Content branch is trained with a JEPA scheme plus a contrastive objective to remain stable and invariant to appearance variations.
- The Style branch is trained to capture appearance-specific signatures (textures, contrast, scattering) via feature prediction and reconstruction with an adversarial constraint.
- Experiments on ImageNet-1K, fine-grained weather characterization, and ISIC 2024 melanoma detection show strong style isolation performance (e.g., F1=97% for Multi-Weather, F1=94% on ISIC 2024 with 10% labeled data) while not degrading content semantics (F1=80% on ImageNet-1K).
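
The two-stream idea in the points above can be illustrated with a small sketch. This is not the paper's implementation: the summary does not specify how the gates work or which contrastive loss is used, so the complementary sigmoid gate in `gated_split` and the InfoNCE form in `info_nce` are both assumptions chosen for illustration, and all function and parameter names are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_split(z, gate_logits):
    """Split one latent vector into a content stream and a style stream.

    Assumed gating (not from the paper): each dimension is shared between
    the two streams by g = sigmoid(logit), so content + style == z.
    """
    if len(z) != len(gate_logits):
        raise ValueError("z and gate_logits must have the same length")
    gates = [sigmoid(a) for a in gate_logits]
    content = [g * v for g, v in zip(gates, z)]
    style = [(1.0 - g) * v for g, v in zip(gates, z)]
    return content, style


def cosine(u, v, eps: float = 1e-12) -> float:
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def info_nce(anchor, positive, negatives, temperature: float = 0.1) -> float:
    """InfoNCE loss for one anchor (assumed form of the contrastive objective).

    The loss is low when the anchor is closer to the positive (the same image
    under a different appearance) than to any of the negatives.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


if __name__ == "__main__":
    # Two views of the same image with different appearance (toy latents).
    z_view_a = [1.0, 0.2, 0.9, -0.4]
    z_view_b = [1.0, 0.2, -0.7, 0.8]
    z_other = [-0.8, 0.5, 0.9, -0.4]
    # Hypothetical gate logits: first two dims lean content, last two style.
    logits = [4.0, 4.0, -4.0, -4.0]

    content_a, style_a = gated_split(z_view_a, logits)
    content_b, _ = gated_split(z_view_b, logits)
    content_other, _ = gated_split(z_other, logits)

    print("content loss:", info_nce(content_a, content_b, [content_other]))
    print("style stream:", style_a)
```

In this toy setup the content stream is pushed toward agreement across appearance changes, while whatever the gate routes away from it remains available to the style stream. The style branch's feature prediction, reconstruction, and adversarial constraint are not sketched here.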