Dynamic Distillation and Gradient Consistency for Robust Long-Tailed Incremental Learning

arXiv cs.CV / 5/6/2026


Key Points

  • The paper targets Long-tailed Class Incremental Learning (LT-CIL), where new classes arrive sequentially with highly imbalanced distributions, making catastrophic forgetting worse while also causing under-learning of minority classes and overfitting of majority classes.
  • It introduces gradient consistency regularization, which uses a moving average of gradients to damp abrupt fluctuations and stabilize training (a minimal sketch follows this list).
  • It proposes dynamically re-weighting the distillation loss according to the degree of class imbalance, measured via normalized entropy, to balance old-knowledge retention against learning new classes (illustrated in the sketch after the abstract).
  • Experiments on CIFAR-100-LT, ImageNetSubset-LT, and Food101-LT show accuracy gains up to 5.0% and especially large improvements in the difficult “In-ordered” setting (majority-to-minority task order).
  • The authors report that the gains are achieved without significant additional computational overhead, supporting the method’s practical deployment potential.
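
To make the first idea concrete, here is a minimal sketch of gradient consistency regularization, read as an exponential moving average (EMA) over per-parameter gradients. The paper's exact formulation is not reproduced in this summary, so the class name `GradEMASmoother`, the `beta` decay, and the blend-then-step design are all illustrative assumptions.

```python
# A minimal sketch of gradient consistency regularization, interpreted here
# as an exponential moving average (EMA) over per-parameter gradients.
# Class name, beta value, and blend-then-step design are assumptions.
import torch


class GradEMASmoother:
    """Keeps an EMA of each parameter's gradient and smooths fresh gradients."""

    def __init__(self, params, beta: float = 0.9):
        self.params = [p for p in params if p.requires_grad]
        self.beta = beta  # EMA decay: higher values smooth more aggressively
        self.ema = [torch.zeros_like(p) for p in self.params]

    @torch.no_grad()
    def smooth_(self) -> None:
        """Update each gradient EMA, then overwrite p.grad with the average."""
        for p, g_ema in zip(self.params, self.ema):
            if p.grad is None:
                continue
            g_ema.mul_(self.beta).add_(p.grad, alpha=1.0 - self.beta)
            p.grad.copy_(g_ema)  # optimizer then steps on the smoothed grad


# Usage inside a standard training loop (model, optimizer, loader assumed):
#   smoother = GradEMASmoother(model.parameters(), beta=0.9)
#   for x, y in loader:
#       optimizer.zero_grad()
#       loss_fn(model(x), y).backward()
#       smoother.smooth_()   # damp abrupt gradient fluctuations
#       optimizer.step()
```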

Abstract

The task of Long-tailed Class Incremental Learning (LT-CIL) addresses the sequential learning of new classes from datasets with imbalanced class distributions. This scenario compounds catastrophic forgetting, the fundamental problem inherent to continual learning, with the dual challenges of under-learning minority classes and overfitting majority classes. To tackle these combined issues, this paper proposes two main techniques. First, we introduce gradient consistency regularization, which leverages the moving average of gradients to suppress abrupt fluctuations and stabilize the training process. Second, we dynamically adjust the weight of the distillation loss by measuring the degree of class imbalance with normalized entropy. This adaptive weighting balances retaining old knowledge against acquiring new information. Experiments on the CIFAR-100-LT, ImageNetSubset-LT, and Food101-LT benchmarks show that our method achieves consistent accuracy improvements of up to 5.0%. Furthermore, we demonstrate substantial gains in the challenging 'In-ordered' setting, where tasks progress from majority to minority classes, highlighting our method's robustness in mitigating forgetting under unfavorable learning dynamics. This enhanced performance is achieved without a significant increase in computational overhead, demonstrating the practicality of our framework.
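
As an illustration of the second component, below is a small sketch of the normalized-entropy imbalance measure. The entropy normalization follows the standard definition (Shannon entropy divided by log K, so 1.0 means perfectly balanced and values near 0.0 mean extreme imbalance); the mapping from that score to a distillation-loss coefficient in `distillation_weight` is a hypothetical choice for illustration, not the paper's exact rule.

```python
# A small sketch of the normalized-entropy imbalance measure. The mapping
# from the balance score to a distillation coefficient is a hypothetical
# choice for illustration, not the paper's exact rule.
import math


def normalized_entropy(class_counts: list[int]) -> float:
    """Shannon entropy of the class-frequency distribution, scaled to [0, 1]."""
    total = sum(class_counts)
    k = len(class_counts)
    if k <= 1 or total == 0:
        return 0.0
    h = -sum((c / total) * math.log(c / total) for c in class_counts if c > 0)
    return h / math.log(k)


def distillation_weight(class_counts: list[int], base_lambda: float = 1.0) -> float:
    """Hypothetical mapping: scale a base coefficient by the balance score."""
    return base_lambda * normalized_entropy(class_counts)


# Example: a long-tailed task with counts [500, 100, 20, 5] scores well
# below 1.0, so the distillation term is down-weighted for that task.
print(distillation_weight([500, 100, 20, 5]))  # ~0.45 with base_lambda=1.0
```

Whether a more balanced incoming task should strengthen or weaken distillation is precisely the trade-off the authors tune adaptively; the linear scaling above is only one plausible instantiation.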