GONE: Structural Knowledge Unlearning via Neighborhood-Expanded Distribution Shaping

arXiv cs.CL / 3/16/2026

Key Points

  • The paper introduces GONE, a graph-based benchmark for evaluating knowledge unlearning of structured knowledge graph facts in LLMs, highlighting three effects: direct fact removal, reasoning-based leakage, and catastrophic forgetting.
  • It presents Neighborhood-Expanded Distribution Shaping (NEDS), a framework that uses graph connectivity to identify anchor neighbors and enforce a precise boundary between the forgotten fact and its semantic neighborhood.
  • Evaluations on LLaMA-3-8B and Mistral-7B across multiple editing/unlearning methods show NEDS achieving top performance (1.000 unlearning efficacy and 0.839 locality) on GONE and other benchmarks.
  • The work underscores the safety, privacy, and intellectual property implications of knowledge unlearning for structured data. Code is available at https://anonymous.4open.science/r/GONE-4679/.
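
To make the idea of using graph connectivity to pick anchor neighbors more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function name `anchor_neighbors`, the triple representation, and the hop-based selection rule are all assumptions made for illustration. The summary only states that NEDS uses graph connectivity to identify anchor neighbors, not how.

```python
from collections import defaultdict

# Assumed representation: a fact is a (head, relation, tail) triple.
Triple = tuple[str, str, str]


def anchor_neighbors(triples: list[Triple], forget: Triple, hops: int = 1) -> list[Triple]:
    """Hypothetical helper: collect facts within `hops` of the forgotten
    fact's entities, excluding the forgotten fact itself."""
    # Index every fact under both of its entities.
    by_entity: dict[str, set[Triple]] = defaultdict(set)
    for head, rel, tail in triples:
        by_entity[head].add((head, rel, tail))
        by_entity[tail].add((head, rel, tail))

    frontier = {forget[0], forget[2]}
    seen = set(frontier)
    anchors: set[Triple] = set()
    for _ in range(hops):
        next_frontier: set[str] = set()
        for entity in frontier:
            for fact in by_entity[entity]:
                if fact == forget:
                    continue  # the forgotten fact is never its own anchor
                anchors.add(fact)
                # Expand to entities not visited yet.
                for other in (fact[0], fact[2]):
                    if other not in seen:
                        seen.add(other)
                        next_frontier.add(other)
        frontier = next_frontier
    return sorted(anchors)


# Toy knowledge graph (illustrative data, not taken from GONE).
kg = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Marie Curie", "field", "Physics"),
    ("Warsaw", "capital_of", "Poland"),
    ("Poland", "continent", "Europe"),
    ("Albert Einstein", "field", "Physics"),
]
forget_fact = ("Marie Curie", "born_in", "Warsaw")

one_hop = anchor_neighbors(kg, forget_fact, hops=1)
two_hop = anchor_neighbors(kg, forget_fact, hops=2)
```

In this toy graph, `one_hop` holds the two facts that share an entity with the forgotten fact, and `two_hop` adds the facts reachable through those neighbors. An unlearning method in the spirit of the description above would suppress the forgotten fact while keeping the model's behavior on such anchors unchanged.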

Abstract

Unlearning knowledge is a pressing and challenging task in Large Language Models (LLMs) because of their unprecedented capability to memorize and digest training data at scale, which raises serious concerns about safety, privacy, and intellectual property. However, existing approaches, including parameter editing, fine-tuning, and distillation-based methods, focus on flat sentence-level data and overlook the relational, multi-hop, and reasoned knowledge in naturally structured data. In response to this gap, this paper introduces Graph Oblivion and Node Erasure (GONE), a benchmark for evaluating knowledge unlearning over structured knowledge graph (KG) facts in LLMs. This KG-based benchmark enables the disentanglement of three effects of unlearning: direct fact removal, reasoning-based leakage, and catastrophic forgetting. In addition, Neighborhood-Expanded Distribution Shaping (NEDS), a novel unlearning framework, is designed to leverage graph connectivity and identify anchor correlated neighbors, enforcing a precise decision boundary between the forgotten fact and its semantic neighborhood. Evaluations on LLaMA-3-8B and Mistral-7B across multiple knowledge editing and unlearning methods show that NEDS performs best (1.000 on unlearning efficacy and 0.839 on locality) on GONE and other benchmarks. Code is available at https://anonymous.4open.science/r/GONE-4679/.
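
The three effects named in the abstract (direct fact removal, reasoning-based leakage, and catastrophic forgetting) can be scored separately once probes are grouped by type. The sketch below is one way to do that under stated assumptions: the probe format, the exact-match scoring, the `unlearning_report` function, and the stub model are all hypothetical, and the metric definitions used in the paper may differ.

```python
from typing import Callable

# Assumed probe format: (question, gold answer).
Probe = tuple[str, str]


def accuracy(model: Callable[[str], str], probes: list[Probe]) -> float:
    """Fraction of probes answered with an exact (case-insensitive) match."""
    if not probes:
        return 0.0
    hits = sum(model(q).strip().lower() == a.lower() for q, a in probes)
    return hits / len(probes)


def unlearning_report(
    model: Callable[[str], str],
    direct: list[Probe],     # questions asking for the forgotten fact itself
    multi_hop: list[Probe],  # questions answerable only by reasoning through it
    retain: list[Probe],     # neighboring facts that should survive
) -> dict[str, float]:
    """Hypothetical scoring: one number per effect."""
    return {
        "efficacy": 1.0 - accuracy(model, direct),  # higher = fact removed
        "leakage": accuracy(model, multi_hop),      # higher = fact still inferable
        "locality": accuracy(model, retain),        # higher = less collateral damage
    }


# Stub standing in for an unlearned LLM (illustrative answers only).
answers = {
    "Where was Marie Curie born?": "unknown",
    "In which country was Marie Curie born?": "Poland",
    "What was Marie Curie's field?": "Physics",
    "Warsaw is the capital of which country?": "unknown",
}


def stub_model(question: str) -> str:
    return answers.get(question, "unknown")


report = unlearning_report(
    stub_model,
    direct=[("Where was Marie Curie born?", "Warsaw")],
    multi_hop=[("In which country was Marie Curie born?", "Poland")],
    retain=[
        ("What was Marie Curie's field?", "Physics"),
        ("Warsaw is the capital of which country?", "Poland"),
    ],
)
```

With this stub, the direct probe fails (efficacy 1.0), the multi-hop probe still succeeds (leakage 1.0), and one of two retained facts is lost (locality 0.5). That is the kind of pattern a flat, sentence-level evaluation would miss.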