EviMem: Evidence-Gap-Driven Iterative Retrieval for Long-Term Conversational Memory

arXiv cs.CL / 5/1/2026

💬 Opinion · Tools & Practical Usage · Models & Research

Key Points

  • The paper introduces EviMem, a method for long-term conversational memory that performs evidence-gap-driven iterative retrieval rather than relying on single-pass retrieval or untargeted query refinement.
  • EviMem uses a closed-loop framework (IRIS) that evaluates whether the accumulated retrieval set is sufficient, diagnoses what evidence is still missing, and refines the query to target that gap.
  • It also proposes LaceMem, a layered coarse-to-fine memory hierarchy that supports fine-grained diagnosis of evidence gaps across sessions.
  • Experiments on LoCoMo show EviMem improves Judge Accuracy versus MIRIX for temporal questions (73.3% to 81.6%) and multi-hop questions (65.9% to 85.2%) while achieving 4.5× lower latency.
  • The authors provide an implementation via a GitHub repository, enabling replication and further development of the approach.
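The IRIS loop described above (retrieve, check sufficiency, diagnose the gap, refine the query) can be sketched as a small driver function. This is an illustrative reconstruction, not the paper's implementation: `retrieve`, `is_sufficient`, `diagnose_gap`, and `refine_query` are hypothetical callables standing in for the framework's components.

```python
def iris_retrieve(question, retrieve, is_sufficient, diagnose_gap,
                  refine_query, max_rounds=3):
    """Evidence-gap-driven retrieval loop (illustrative sketch only).

    All four callables are hypothetical placeholders, not the paper's API:
    - retrieve(query) -> list of evidence strings
    - is_sufficient(question, evidence) -> bool
    - diagnose_gap(question, evidence) -> description of what is missing
    - refine_query(question, gap) -> a new query targeting the gap
    """
    query, evidence = question, []
    for _ in range(max_rounds):
        for doc in retrieve(query):
            if doc not in evidence:          # accumulate without duplicates
                evidence.append(doc)
        if is_sufficient(question, evidence):
            break                            # evidence gap is closed
        gap = diagnose_gap(question, evidence)
        query = refine_query(question, gap)  # targeted, not blind, refinement
    return evidence
```

The key contrast with untargeted refinement is that the new query is derived from an explicit diagnosis of what is missing from the accumulated set, so each round retrieves toward the residual gap rather than re-querying the original question.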

Abstract

Long-term conversational memory requires retrieving evidence scattered across multiple sessions, yet single-pass retrieval fails on temporal and multi-hop questions. Existing iterative methods refine queries via generated content or document-level signals, but none explicitly diagnoses the evidence gap, namely what is missing from the accumulated retrieval set, leaving query refinement untargeted. We present EviMem, combining IRIS (Iterative Retrieval via Insufficiency Signals), a closed-loop framework that detects evidence gaps through sufficiency evaluation, diagnoses what is missing, and drives targeted query refinement, with LaceMem (Layered Architecture for Conversational Evidence Memory), a coarse-to-fine memory hierarchy supporting fine-grained gap diagnosis. On LoCoMo, EviMem improves Judge Accuracy over MIRIX on temporal (73.3% to 81.6%) and multi-hop (65.9% to 85.2%) questions at 4.5x lower latency. Code: https://github.com/AIGeeksGroup/EviMem.