AI Navigate

CLAG: Adaptive Memory Organization via Agent-Driven Clustering for Small Language Model Agents

arXiv cs.CL / 3/17/2026

Key Points

  • CLAG introduces a clustering-based memory framework for small language model agents to organize experiences into semantically coherent clusters, reducing cross-topic interference.
  • The system uses an SLM-driven router to assign memories to clusters and autonomously generate cluster-specific profiles, including topic summaries and descriptive tags.
  • Retrieval is performed in two stages: first filtering relevant clusters via their profiles to exclude distractors, then searching within the selected clusters, thereby shrinking the search space.
  • Experiments on multiple QA datasets with three SLM backbones show that CLAG improves answer quality and robustness while remaining lightweight and efficient.
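The write path in the key points (route an incoming memory to a cluster, then refresh that cluster's profile) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Cluster` class, the `route` function, the token-overlap score, and the `threshold` value are hypothetical, and a simple keyword heuristic stands in for the SLM-driven router and the SLM-generated profiles described in the summary.

```python
import re
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical stopword list, only to keep the toy tags meaningful.
STOPWORDS = {"a", "an", "the", "to", "in", "on", "of", "and", "for", "with", "i", "my"}


def tokens(text: str) -> list[str]:
    """Lowercased word tokens with stopwords removed."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


@dataclass
class Cluster:
    memories: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)

    def refresh_profile(self, max_tags: int = 5) -> None:
        # Stand-in for SLM-generated tags: the most frequent tokens in the cluster.
        counts = Counter(t for m in self.memories for t in tokens(m))
        self.tags = [t for t, _ in counts.most_common(max_tags)]


def route(memory: str, clusters: list[Cluster], threshold: float = 0.2) -> Cluster:
    """Assign a memory to the best-matching cluster, or open a new one."""
    mem = set(tokens(memory))
    best, best_score = None, 0.0
    for cluster in clusters:
        tags = set(cluster.tags)
        union = mem | tags
        # Jaccard overlap between the memory and the cluster's tags
        # (an assumed stand-in for the router's judgment).
        score = len(mem & tags) / len(union) if union else 0.0
        if score > best_score:
            best, best_score = cluster, score
    if best is None or best_score < threshold:
        best = Cluster()
        clusters.append(best)
    best.memories.append(memory)
    best.refresh_profile()
    return best


clusters: list[Cluster] = []
for note in [
    "Booked a flight to Tokyo",
    "Changed the Tokyo flight to Friday",
    "Baked sourdough bread",
]:
    route(note, clusters)
```

In this toy run the two flight notes share a cluster and the baking note opens a second one, so each cluster's tags describe a single topic.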

Abstract

Large language model agents rely heavily on external memory to support knowledge reuse and complex reasoning tasks. Yet most memory systems store experiences in a single global retrieval pool, which can gradually dilute or corrupt stored knowledge. This problem is especially pronounced for small language models (SLMs), which are highly vulnerable to irrelevant context. We introduce CLAG, a CLustering-based AGentic memory framework in which an SLM agent actively organizes memory by clustering. CLAG employs an SLM-driven router to assign incoming memories to semantically coherent clusters and autonomously generates cluster-specific profiles, including topic summaries and descriptive tags, to establish each cluster as a self-contained functional unit. By performing localized evolution within these structured neighborhoods, CLAG reduces cross-topic interference and enhances internal memory density. During retrieval, the framework uses a two-stage process: it first filters relevant clusters via their profiles, thereby excluding distractors and reducing the search space, and then searches only within the selected clusters. Experiments on multiple QA datasets with three SLM backbones show that CLAG consistently improves answer quality and robustness over prior memory systems for agents while remaining lightweight and efficient.
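The two-stage retrieval described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the `Cluster` class, the `retrieve` function, and the `top_clusters` and `top_k` parameters are hypothetical, and a bag-of-words cosine similarity stands in for whatever profile matching and within-cluster search CLAG actually uses.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


@dataclass
class Cluster:
    summary: str
    tags: list[str]
    memories: list[str] = field(default_factory=list)

    def profile_text(self) -> str:
        # The profile combines the topic summary and the descriptive tags.
        return self.summary + " " + " ".join(self.tags)


def retrieve(query: str, clusters: list[Cluster], top_clusters: int = 1, top_k: int = 2) -> list[str]:
    q = embed(query)
    # Stage 1: rank clusters by profile similarity and keep only the best ones.
    ranked = sorted(clusters, key=lambda c: cosine(q, embed(c.profile_text())), reverse=True)
    selected = ranked[:top_clusters]
    # Stage 2: search memories only inside the selected clusters.
    candidates = [m for c in selected for m in c.memories]
    candidates.sort(key=lambda m: cosine(q, embed(m)), reverse=True)
    return candidates[:top_k]


clusters = [
    Cluster(
        "cooking and recipes",
        ["food", "pasta", "recipe"],
        ["Made carbonara pasta with eggs and cheese", "Baked sourdough bread on Sunday"],
    ),
    Cluster(
        "travel plans",
        ["flight", "hotel", "trip"],
        ["Booked a flight to Tokyo in May", "Found a pasta restaurant near the Rome hotel"],
    ),
]
results = retrieve("which pasta recipe did I cook", clusters)
```

The travel cluster holds a memory that mentions pasta, which a single global pool could surface as a distractor. Because stage 1 drops that cluster based on its profile, the memory never becomes a candidate in stage 2.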