HiEdit: Lifelong Model Editing with Hierarchical Reinforcement Learning

arXiv cs.CL / 4/14/2026


Key Points

  • HiEdit introduces a lifelong model editing approach for sequentially fixing outdated or incorrect knowledge in deployed LLMs while reducing unintended side effects on other inputs.
  • The work argues that knowledge is stored layer-wise rather than uniformly across all dense layers, and it therefore avoids applying the same set of parameter perturbations for every edit.
  • Using hierarchical reinforcement learning, HiEdit adaptively selects the most knowledge-relevant layers per editing instance and adds an intrinsic reward to encourage sparse, localized updates.
  • Experiments across multiple LLMs show HiEdit improves upon RLEdit by an average of 8.48% while perturbing only about half of the layers per edit, helping mitigate catastrophic forgetting risks.
  • The authors provide open-source code on GitHub to support replication and further experimentation with the proposed framework.
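To make the mechanism in the points above concrete, here is a minimal toy sketch of instance-aware layer selection with an intrinsic sparsity reward. Everything here is illustrative: the threshold policy, the `SPARSITY_WEIGHT` value, and the reward combination are assumptions for exposition, not HiEdit's actual learned hierarchical policy.

```python
import random

random.seed(0)

NUM_LAYERS = 32        # hypothetical transformer depth
SPARSITY_WEIGHT = 0.1  # assumed weight for the intrinsic sparsity term

def select_layers(scores, threshold=0.5):
    """Stand-in for the high-level policy: keep only the layers whose
    per-instance relevance score exceeds a threshold."""
    return [i for i, s in enumerate(scores) if s > threshold]

def intrinsic_sparsity_reward(selected, num_layers=NUM_LAYERS):
    """Intrinsic reward that grows as fewer layers are perturbed,
    encouraging sparse, localized updates."""
    return 1.0 - len(selected) / num_layers

def total_reward(edit_success, selected):
    """Combine the extrinsic editing reward with the sparsity bonus."""
    return edit_success + SPARSITY_WEIGHT * intrinsic_sparsity_reward(selected)

# Toy per-instance relevance scores; in the actual framework these would
# come from a learned policy conditioned on the editing instance.
scores = [random.random() for _ in range(NUM_LAYERS)]
selected = select_layers(scores)
print(f"selected {len(selected)}/{NUM_LAYERS} layers, "
      f"reward={total_reward(edit_success=1.0, selected=selected):.3f}")
```

The key design point the sketch captures is that the set of perturbed layers is recomputed for every edit, and the reward explicitly pays for leaving layers untouched, which is how the framework steers toward editing roughly half the layers rather than all of them.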

Abstract

Lifelong model editing (LME) aims to sequentially rectify outdated or inaccurate knowledge in deployed LLMs while minimizing side effects on unrelated inputs. However, existing approaches typically apply parameter perturbations to a static and dense set of LLM layers for all editing instances. This practice is counter-intuitive, as we hypothesize that different pieces of knowledge are stored in distinct layers of the model. Neglecting this layer-wise specificity can impede adaptability in integrating new knowledge and result in catastrophic forgetting of both general and previously edited knowledge. To address this, we propose HiEdit, a hierarchical reinforcement learning framework that adaptively identifies the most knowledge-relevant layers for each editing instance. By enabling dynamic, instance-aware layer selection and incorporating an intrinsic reward for sparsity, HiEdit achieves precise, localized updates. Experiments on various LLMs show that HiEdit boosts the performance of the competitive RLEdit by an average of 8.48% while perturbing only half of the layers per edit. Our code is available at: https://github.com/yangfanww/hiedit.