HAD: Heterogeneity-Aware Distillation for Lifelong Heterogeneous Learning
arXiv cs.CV / 3/30/2026
Key Points
- The paper introduces “lifelong heterogeneous learning (LHL),” a setting where a model must learn a sequence of tasks with different output structures while retaining prior knowledge.
- It instantiates LHL for dense prediction (LHL4DP), a realistic and challenging scenario that requires preserving heterogeneous knowledge across different pixel- and region-level output behaviors.
- The authors propose Heterogeneity-Aware Distillation (HAD), an exemplar-free self-distillation approach that distills previously learned knowledge at each training phase.
- HAD combines a distribution-balanced heterogeneity-aware distillation loss, which addresses global prediction imbalance, with a salience-guided loss that emphasizes informative edge pixels identified via the Sobel operator (see the sketch after these points).
- Experiments reported in the paper indicate that HAD significantly outperforms existing methods in this newly formalized LHL4DP setting.
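
The salience-guided component lends itself to a compact illustration. Below is a minimal PyTorch sketch of how a Sobel-based edge map could weight a pixel-wise self-distillation term, assuming a segmentation-style output; the function names, the KL-divergence form of the distillation loss, and the normalization scheme are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch: Sobel-based salience weighting for a pixel-wise
# self-distillation loss. Names and loss form are assumptions for illustration.
import torch
import torch.nn.functional as F


def sobel_salience(teacher_logits: torch.Tensor) -> torch.Tensor:
    """Per-pixel salience from the edge magnitude of the teacher's soft prediction.

    teacher_logits: (B, C, H, W) logits from the frozen previous-phase model.
    Returns a (B, 1, H, W) weight map normalized to [0, 1] per image.
    """
    # Collapse channels to a single confidence map, then detect its edges.
    prob = teacher_logits.softmax(dim=1)
    conf = prob.max(dim=1, keepdim=True).values  # (B, 1, H, W)

    kx = torch.tensor([[-1.0, 0.0, 1.0],
                       [-2.0, 0.0, 2.0],
                       [-1.0, 0.0, 1.0]], device=conf.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)  # Sobel kernel for the vertical direction
    gx = F.conv2d(conf, kx, padding=1)
    gy = F.conv2d(conf, ky, padding=1)
    grad = torch.sqrt(gx ** 2 + gy ** 2 + 1e-12)

    # Normalize per image so edge weights are comparable across samples.
    peak = grad.flatten(1).max(dim=1).values.view(-1, 1, 1, 1)
    return grad / (peak + 1e-12)


def salience_weighted_distill(student_logits: torch.Tensor,
                              teacher_logits: torch.Tensor,
                              tau: float = 2.0) -> torch.Tensor:
    """KL distillation from teacher to student, up-weighted at edge pixels."""
    w = sobel_salience(teacher_logits).detach()           # (B, 1, H, W)
    log_p_s = F.log_softmax(student_logits / tau, dim=1)
    p_t = F.softmax(teacher_logits / tau, dim=1)
    kl = (p_t * (p_t.clamp_min(1e-12).log() - log_p_s)).sum(dim=1, keepdim=True)
    return (w * kl).sum() / (w.sum() + 1e-12) * tau ** 2


if __name__ == "__main__":
    student = torch.randn(2, 19, 64, 64, requires_grad=True)
    teacher = torch.randn(2, 19, 64, 64)
    loss = salience_weighted_distill(student, teacher)
    loss.backward()
    print(float(loss))
```

Detaching the weight map keeps gradients flowing only through the student's predictions, so the salience term acts purely as a fixed per-pixel importance mask derived from the previous-phase model.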