APEX-MEM: Agentic Semi-Structured Memory with Temporal Reasoning for Long-Term Conversational AI

arXiv cs.CL / 4/17/2026

Key Points

The paper introduces APEX-MEM, a long-term conversational memory system that addresses reliability issues caused by context-window expansion or naive retrieval, which can add noise and destabilize responses.
APEX-MEM structures conversations using a domain-agnostic ontology in a property graph, representing temporally grounded events within an entity-centric framework.
It uses append-only storage to retain the full temporal evolution of information across interactions.
A multi-tool retrieval agent resolves conflicting or changing information at query time and outputs a compact, contextually relevant memory summary while suppressing irrelevant details.
The system reports strong results on LOCOMO’s QA task (88.88%) and LongMemEval (86.2%), outperforming session-aware approaches and showing the benefit of property graphs for temporally coherent reasoning.
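
To make the append-only and query-time-resolution ideas above concrete, here is a minimal Python sketch. It is not the paper's implementation: the `Event` schema, the `AppendOnlyMemory` class, and the "latest observation wins" rule are all assumptions chosen for illustration, since this summary does not describe APEX-MEM's actual ontology or resolution logic.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """One temporally grounded fact about an entity (hypothetical schema)."""
    entity: str      # node the event is attached to, e.g. "Alice"
    attribute: str   # property being described, e.g. "city"
    value: str
    observed: date   # when the conversation stated this


class AppendOnlyMemory:
    """Events are only ever appended; nothing is updated or deleted."""

    def __init__(self) -> None:
        self._log: List[Event] = []

    def append(self, event: Event) -> None:
        self._log.append(event)

    def history(self, entity: str, attribute: str) -> List[Event]:
        """Full temporal evolution of one attribute, oldest first."""
        matches = [e for e in self._log
                   if e.entity == entity and e.attribute == attribute]
        return sorted(matches, key=lambda e: e.observed)

    def resolve(self, entity: str, attribute: str,
                as_of: Optional[date] = None) -> Optional[Event]:
        """Assumed rule: the latest event observed on or before as_of wins."""
        candidates = [e for e in self.history(entity, attribute)
                      if as_of is None or e.observed <= as_of]
        return candidates[-1] if candidates else None


memory = AppendOnlyMemory()
memory.append(Event("Alice", "city", "Boston", date(2023, 1, 10)))
memory.append(Event("Alice", "city", "Denver", date(2024, 6, 2)))

current = memory.resolve("Alice", "city")
earlier = memory.resolve("Alice", "city", as_of=date(2023, 12, 31))
print(current.value, earlier.value, len(memory.history("Alice", "city")))
```

Because the conflicting "Boston" record is never overwritten, the same store can answer both "where does Alice live now?" and "where did she live at the end of 2023?", which is the kind of temporal question that in-place updates would make unanswerable.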

Abstract

Large language models still struggle with reliable long-term conversational memory: simply enlarging context windows or applying naive retrieval often introduces noise and destabilizes responses. We present APEX-MEM, a conversational memory system that combines three key innovations: (1) a property graph that uses a domain-agnostic ontology to structure conversations as temporally grounded events in an entity-centric framework, (2) append-only storage that preserves the full temporal evolution of information, and (3) a multi-tool retrieval agent that understands and resolves conflicting or evolving information at query time, producing a compact and contextually relevant memory summary. This retrieval-time resolution preserves the full interaction history while suppressing irrelevant details. APEX-MEM achieves 88.88% accuracy on LOCOMO's Question Answering task and 86.2% on LongMemEval, outperforming state-of-the-art session-aware approaches and demonstrating that structured property graphs enable more temporally coherent long-term conversational reasoning.
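
The retrieval side of the abstract (a tool-using agent that turns graph contents into a compact summary) could look roughly like the toy Python sketch below. Everything here is hypothetical: the `PropertyGraph` and `Edge` types, the two "tools" `find_edges` and `summarize`, and the fixed call order in `retrieve` stand in for an LLM-driven agent whose real tool set is not given in this summary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Edge:
    source: str
    label: str
    target: str
    date: str  # ISO 8601 string, so lexicographic order is chronological


@dataclass
class PropertyGraph:
    nodes: Dict[str, Dict[str, str]] = field(default_factory=dict)
    edges: List[Edge] = field(default_factory=list)

    def add_edge(self, source: str, label: str, target: str, date: str) -> None:
        # Register both endpoints as nodes, then append the dated edge.
        for name in (source, target):
            self.nodes.setdefault(name, {})
        self.edges.append(Edge(source, label, target, date))


def find_edges(graph: PropertyGraph, entity: str, label: str) -> List[Edge]:
    """Hypothetical tool 1: edges from `entity` with `label`, oldest first."""
    hits = [e for e in graph.edges if e.source == entity and e.label == label]
    return sorted(hits, key=lambda e: e.date)


def summarize(edges: List[Edge]) -> str:
    """Hypothetical tool 2: one-line timeline, newest entry marked current."""
    if not edges:
        return "no matching memory"
    parts = [f"{e.target} ({e.date})" for e in edges]
    parts[-1] += " [current]"
    return "; ".join(parts)


def retrieve(graph: PropertyGraph, entity: str, label: str) -> str:
    """Toy agent loop: call the lookup tool, then the summary tool."""
    return f"{entity} {label}: " + summarize(find_edges(graph, entity, label))


graph = PropertyGraph()
graph.add_edge("Alice", "WORKS_AT", "Acme", "2023-03-01")
graph.add_edge("Alice", "LIKES", "hiking", "2023-05-12")
graph.add_edge("Alice", "WORKS_AT", "Globex", "2024-09-15")

summary = retrieve(graph, "Alice", "WORKS_AT")
print(summary)
```

In this sketch the unrelated `LIKES` edge stays in the graph but never reaches the summary, which loosely mirrors the abstract's claim that full history is preserved while irrelevant details are suppressed at retrieval time.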