Project Shadows: Turns out "just add memory" doesn't fix your agent
Reddit r/artificial / 4/20/2026
💬 Opinion · Signals & Early Trends · Ideas & Deep Analysis · Models & Research

I've been building a multi-agent system called Shadows for a few months: nine agents collaborating on strategy work over a shared memory layer. I spent most of my time on retrieval, because that's what every benchmark measures (Mem0, MemPalace, Graphiti, all of them). On LongMemEval, recall_all@5 hit 97%, but overall accuracy was only 73%. So the right memories are there; the agent still picks the wrong answer. It can't aggregate across sessions, doesn't know when to abstain, and guesses which aspect of a preference the user meant. That lined up with something I've been stuck on: most LLMs jump straight to execution when you give them a task. People don't. We filter first, check whether we're even the right person for the job, and only then start. Next direction: agents that can be moved along with their identity and memory!
Key Points
- The author describes building “Shadows,” a multi-agent system with nine agents collaborating using a shared memory layer, and reports strong retrieval performance on LongMemEval (recall_all@5 at 97%).
- Despite having the right memories retrieved, the agent still produces incorrect answers, indicating that adding memory alone does not solve agent reasoning failures.
- The write-up attributes errors to limitations such as poor cross-session aggregation, lack of calibration for when to abstain, and difficulty interpreting which part of a user preference the user intended.
- The author contrasts agent behavior with human workflows, arguing that people typically filter and verify identity/context before executing, unlike many LLM agents that jump straight to action.
- The proposed next step is to develop agents that can be moved/controlled alongside their identity and memory, aiming to better align behavior with the needed pre-filtering process.
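The retrieval-versus-accuracy gap in the key points can be made concrete with a small sketch. The example below is illustrative, not the author's code: the data, memory ids, and helper names are invented to show how recall_all@k (all gold memories present in the top-k retrieved) can be perfect while end-task accuracy stays low, because answering correctly also requires aggregation and calibrated abstention after retrieval.

```python
# Hypothetical evaluation records: for each question, the gold memory ids,
# the top-k retrieved memory ids, and whether the agent's final answer
# was actually correct. (Invented data for illustration.)
examples = [
    {"gold": {"m1"},       "retrieved": ["m1", "m7", "m3", "m9", "m2"], "answer_correct": True},
    {"gold": {"m4", "m5"}, "retrieved": ["m4", "m5", "m8", "m1", "m6"], "answer_correct": False},
    {"gold": {"m2"},       "retrieved": ["m2", "m3", "m4", "m5", "m6"], "answer_correct": False},
]

def recall_all_at_k(examples, k=5):
    """Fraction of examples whose gold memories ALL appear in the top-k retrieved."""
    hits = sum(ex["gold"] <= set(ex["retrieved"][:k]) for ex in examples)
    return hits / len(examples)

def accuracy(examples):
    """Fraction of examples the agent answered correctly end-to-end."""
    return sum(ex["answer_correct"] for ex in examples) / len(examples)

print(recall_all_at_k(examples))  # 1.0: every gold memory was retrieved
print(accuracy(examples))         # ~0.33: the agent still answers most wrong
```

Here retrieval is flawless on every example, yet accuracy is one in three, which is the same shape as the 97% recall_all@5 versus 73% accuracy gap the post reports.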
Related Articles
- The Agent Contract Problem: When Your Agent Commits to Something It Can't Deliver (Dev.to)
- How to Turn Any SaaS Into a Telegram Bot in 30 Minutes Using OpenClaw (Dev.to)
- Headless everything for personal AI (Simon Willison's Blog)
- Which model to summarize rss news articles (Reddit r/LocalLLaMA)
- Meta-Optimized Continual Adaptation for bio-inspired soft robotics maintenance with zero-trust governance guarantees (Dev.to)