Sparse Task Vector Mixup with Hypernetworks for Efficient Knowledge Transfer in Whole-Slide Image Prognosis
arXiv cs.CV / 3/12/2026
Key Points
- The STEPH method introduces sparse task vector mixup with hypernetworks to transfer prognostic knowledge across cancer types for whole-slide image prognosis.
- For each source-target cancer pair, it mixes the corresponding task vectors, then sparsely aggregates the resulting mixtures into an improved target model, with hypernetworks guiding the process.
- The approach reduces dependence on large-scale joint training or extensive multi-model inference, offering a more computationally efficient knowledge transfer solution.
- Experiments on 13 cancer datasets show STEPH outperforms cancer-specific learning by 5.14% and a prior knowledge-transfer baseline by 2.01%.
- The authors have released their code publicly on GitHub.
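The core idea behind the key points above can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: the function names, the uniform averaging of mixtures, the top-k magnitude sparsification, and the externally supplied mixing coefficients (which in STEPH would come from the hypernetworks) are all assumptions for the sake of the example.

```python
import numpy as np

# A "task vector" is the difference between a fine-tuned model's weights
# and the shared pretrained weights: tau = theta_finetuned - theta_base.

def task_vector(theta_finetuned, theta_base):
    """Task vector: fine-tuned weights minus pretrained base weights."""
    return theta_finetuned - theta_base

def sparse_mixup(tau_target, tau_sources, lambdas, keep_ratio=0.1):
    """Mix the target task vector with each source task vector, then
    sparsely aggregate by keeping only the largest-magnitude entries.

    lambdas: per-source mixing coefficients in [0, 1] (in STEPH these
    would be predicted by hypernetworks; here they are plain inputs).
    """
    mixtures = [lam * tau_s + (1.0 - lam) * tau_target
                for tau_s, lam in zip(tau_sources, lambdas)]
    aggregate = np.mean(mixtures, axis=0)
    # Sparsify: zero out all but the top keep_ratio fraction by magnitude.
    k = max(1, int(keep_ratio * aggregate.size))
    threshold = np.sort(np.abs(aggregate).ravel())[-k]
    return np.where(np.abs(aggregate) >= threshold, aggregate, 0.0)

# Toy usage: build an improved target model from the base weights.
rng = np.random.default_rng(0)
theta_base = rng.normal(size=100)
tau_target = rng.normal(size=100)             # target-cancer task vector
tau_sources = [rng.normal(size=100) for _ in range(3)]  # 3 source cancers
theta_improved = theta_base + sparse_mixup(
    tau_target, tau_sources, lambdas=[0.3, 0.5, 0.2])
```

Because only the sparse aggregate is added back to the base weights, the transfer step touches a small fraction of parameters and needs no joint training over all cancer types, which is where the claimed efficiency comes from.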