CURaTE: Continual Unlearning in Real Time with Ensured Preservation of LLM Knowledge

arXiv cs.CL / April 17, 2026


Key Points

  • The paper argues that LLMs need post-training “unlearning” methods because it is impossible to perfectly pre-filter all problematic data during pre-training.
  • It introduces CURaTE, which uses a separately trained sentence-embedding model to detect similarity between incoming prompts and stored “forget” requests and then either answers or refuses.
  • CURaTE is designed for continuous, immediate (real-time) unlearning so that utility does not degrade as more updates accumulate.
  • The authors report that CURaTE forgets more effectively than existing methods while preserving knowledge near-perfectly across any number of updates because it does not modify the LLM parameters.
  • The work claims CURaTE is the only approach that supports continual unlearning in real time without changing the model weights.

Abstract

The inability to filter out in advance all potentially problematic data from the pre-training of large language models has given rise to the need for methods that unlearn specific pieces of knowledge after training. Existing techniques overlook the need for continuous and immediate action, causing them to suffer from degraded utility as updates accumulate and from protracted exposure of sensitive information. To address these issues, we propose Continual Unlearning in Real Time with Ensured Preservation of LLM Knowledge (CURaTE). Our method begins by training a sentence-embedding model on a dataset designed to enable the formation of sharp decision boundaries for determining whether a given input prompt corresponds to any stored forget requests. The similarity of a given input to the forget requests is then used to decide whether to answer or to return a refusal response. We show that even with such a simple approach, not only does CURaTE achieve more effective forgetting than existing methods, but by avoiding modification of the language model parameters, it also maintains near-perfect knowledge preservation over any number of updates and is the only method capable of continual unlearning in real time.
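The gating mechanism the abstract describes can be sketched with a few lines of Python. This is a minimal illustration, not the paper's implementation: the bag-of-words `embed` function below is a toy stand-in for CURaTE's trained sentence-embedding model, and the threshold value and class/method names (`UnlearningGate`, `add_forget_request`, `route`) are assumptions chosen for the example.

```python
import math

# Toy stand-in for CURaTE's trained sentence-embedding model: a normalized
# bag-of-words vector over a growing vocabulary. The paper instead trains a
# sentence-embedding model for sharp decision boundaries.
VOCAB: dict[str, int] = {}

def embed(text: str) -> dict[int, float]:
    counts: dict[int, float] = {}
    for tok in text.lower().split():
        idx = VOCAB.setdefault(tok, len(VOCAB))
        counts[idx] = counts.get(idx, 0.0) + 1.0
    norm = math.sqrt(sum(v * v for v in counts.values()))
    return {i: v / norm for i, v in counts.items()} if norm else counts

def cosine(a: dict[int, float], b: dict[int, float]) -> float:
    # Both vectors are unit-normalized, so the dot product is the cosine.
    return sum(v * b.get(i, 0.0) for i, v in a.items())

class UnlearningGate:
    """Refuses prompts whose similarity to any stored forget request
    exceeds a threshold; otherwise forwards them to the LLM."""

    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self.forget: list[dict[int, float]] = []

    def add_forget_request(self, request: str) -> None:
        # Real-time update: store an embedding, never touch the LLM weights.
        self.forget.append(embed(request))

    def route(self, prompt: str) -> str:
        e = embed(prompt)
        if any(cosine(e, f) >= self.threshold for f in self.forget):
            return "refuse"
        return "answer"

gate = UnlearningGate()
gate.add_forget_request("what is alice's home address")
print(gate.route("what is alice's home address"))  # -> refuse (similarity 1.0)
print(gate.route("explain photosynthesis"))        # -> answer (no overlap)
```

Because each forget request is just an appended embedding, an update takes effect immediately and the LLM's parameters, and hence its retained knowledge, are untouched regardless of how many updates accumulate.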