Mitigating Forgetting in Continual Learning with Selective Gradient Projection

arXiv cs.LG / 3/31/2026


Key Points

  • The paper addresses catastrophic forgetting in continual learning by proposing Selective Forgetting-Aware Optimization (SFAO), which controls how neural networks update when new tasks arrive.
  • SFAO regulates gradient directions using cosine similarity and per-layer gating, selectively projecting/accepting/discarding parameter updates to balance plasticity and stability.
  • It uses an efficient Monte Carlo approximation for the selective update mechanism, aiming to keep the method computationally practical.
  • Experiments on standard continual learning benchmarks show competitive accuracy, a roughly 90% reduction in memory cost, and reduced forgetting on MNIST-based tasks.
  • The authors position SFAO as especially suitable for resource-constrained deployments where storing exemplars or large buffers is costly.
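The gating rule described above can be sketched as follows. This is an illustrative reconstruction, not the paper's exact algorithm: the threshold values, the use of a single stored reference gradient per layer, and the orthogonal-projection fallback are all assumptions made for the sketch.

```python
import numpy as np

def cosine_sim(a, b, eps=1e-12):
    # Cosine similarity between two flattened gradient vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def gate_update(g_new, g_ref, accept_thresh=0.0, discard_thresh=-0.9):
    """Per-layer gating in the spirit of SFAO (thresholds are hypothetical):
    - accept the update when it roughly aligns with the reference gradient,
    - project out the conflicting component when it mildly interferes,
    - discard it entirely when it strongly opposes prior-task directions.
    """
    sim = cosine_sim(g_new, g_ref)
    if sim >= accept_thresh:
        return g_new                      # aligned: accept as-is
    if sim <= discard_thresh:
        return np.zeros_like(g_new)       # strongly conflicting: discard
    # mildly conflicting: project onto the subspace orthogonal to g_ref
    return g_new - (g_new @ g_ref) / (g_ref @ g_ref + 1e-12) * g_ref
```

Applied per layer, this keeps updates that help the new task without opposing earlier tasks, which is how the method avoids storing exemplar buffers.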

Abstract

As neural networks are increasingly deployed in dynamic environments, they face catastrophic forgetting: the tendency to overwrite previously learned knowledge when adapting to new tasks, causing severe performance degradation on earlier tasks. We propose Selective Forgetting-Aware Optimization (SFAO), a dynamic method that regulates gradient directions via cosine similarity and per-layer gating, enabling controlled forgetting while balancing plasticity and stability. SFAO selectively projects, accepts, or discards updates using a tunable mechanism with an efficient Monte Carlo approximation. Experiments on standard continual learning benchmarks show that SFAO achieves competitive accuracy with markedly lower memory cost (a roughly 90% reduction) and reduced forgetting on MNIST datasets, making it suitable for resource-constrained scenarios.
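The abstract mentions an efficient Monte Carlo approximation for the selective update mechanism. One plausible instantiation (an assumption for illustration; the paper's exact estimator is not specified here) is to estimate per-layer gradient alignment from a random subsample of coordinates rather than the full parameter vector:

```python
import numpy as np

def mc_cosine_sim(a, b, frac=0.1, rng=None, eps=1e-12):
    # Monte Carlo estimate of cosine similarity between gradients a and b,
    # computed on a random fraction `frac` of coordinates. This is a
    # hypothetical sketch of how a sampled estimator could cut the cost of
    # the gating decision for large layers.
    rng = rng or np.random.default_rng()
    idx = rng.choice(a.size, size=max(1, int(frac * a.size)), replace=False)
    sa, sb = a[idx], b[idx]
    return float(sa @ sb / (np.linalg.norm(sa) * np.linalg.norm(sb) + eps))
```

Subsampling makes each gating decision O(frac · d) instead of O(d) per layer, at the cost of variance in the similarity estimate, which a tunable sample fraction can trade off.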