Been building widemem, an open-source memory layer for LLM agents. Runs fully local with SQLite + FAISS, no cloud, no accounts. Apache 2.0.
The problem I kept hitting: vector stores always return something, even when they have nothing useful. You ask about a user's doctor and the closest match is their lunch order at 0.3 similarity. The LLM sees that context and confidently makes up a doctor's name.
So I added confidence scoring. Every search result now comes back with a confidence label: HIGH, MODERATE, LOW, or NONE. There are also three modes you can pick from:
- **strict**: only returns what it's confident about, says "I don't know" otherwise
- **helpful** (default): returns confident stuff normally, flags uncertain results
- **creative**: "I don't have that stored but I can guess if you want"
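For anyone curious how this kind of similarity-to-confidence mapping can work, here's a rough sketch. The thresholds and function names are my own illustration, not widemem's actual internals:

```python
# Sketch: map cosine similarity to a confidence label, then let the
# chosen mode decide what to surface. Thresholds are illustrative
# guesses, not widemem's real values.

def confidence(similarity: float) -> str:
    if similarity >= 0.75:
        return "HIGH"
    if similarity >= 0.55:
        return "MODERATE"
    if similarity >= 0.35:
        return "LOW"
    return "NONE"

def answer(results, mode="helpful"):
    """results: list of (text, similarity) pairs, best match first."""
    if not results:
        return "I don't know."
    text, sim = results[0]
    label = confidence(sim)
    if mode == "strict":
        # Only hand back results we're confident about.
        return text if label == "HIGH" else "I don't know."
    if mode == "helpful":
        # Return confident results normally, flag uncertain ones.
        return text if label == "HIGH" else f"(uncertain) {text}"
    # creative: admit nothing is stored, but offer to guess.
    return text if label != "NONE" else "Not stored, but I can guess if you want."
```

With this sketch, the lunch-order-at-0.3 case from above maps to NONE, so strict mode says "I don't know." instead of feeding the LLM junk context.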
Also added `mem.pin()` for facts that should never fade (allergies, blood type, that kind of thing). And frustration detection, so when a user says "I already told you this" the system searches harder and boosts that memory.
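The pinning and frustration-boost ideas can be sketched with a toy in-memory store. Everything here (class, method names, decay factor, cue phrases) is my illustration of the concept, not widemem's API:

```python
# Toy sketch: pinned memories are exempt from decay; frustration cues
# in a query boost the weight of the memory the user is pointing at.
# All names and constants are illustrative.

FRUSTRATION_CUES = ("i already told you", "i told you this", "as i said")

class TinyStore:
    def __init__(self):
        self.items = {}  # key -> {"text", "weight", "pinned"}

    def add(self, key, text):
        self.items[key] = {"text": text, "weight": 1.0, "pinned": False}

    def pin(self, key):
        # Facts like allergies or blood type should never fade.
        self.items[key]["pinned"] = True

    def decay(self, factor=0.9):
        for item in self.items.values():
            if not item["pinned"]:
                item["weight"] *= factor

    def boost_on_frustration(self, query, key, factor=1.5):
        # "I already told you this" -> search harder, boost that memory.
        if any(cue in query.lower() for cue in FRUSTRATION_CUES):
            self.items[key]["weight"] *= factor
```

After a decay pass, a pinned allergy keeps its full weight while an unpinned lunch order fades; a frustrated follow-up then bumps the faded memory back up in the ranking.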
There are also retrieval modes now: fast (cheap, 10 results), balanced (the default, 25 results), and deep (50 results, for when accuracy matters more than cost).
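In code, that cost/accuracy knob boils down to how many candidates get pulled before ranking. A minimal sketch using the numbers from the post (the function name is mine, not widemem's):

```python
# Illustrative mode -> candidate-count mapping, using the counts
# mentioned in the post. The names here are not widemem's API.
MODE_K = {"fast": 10, "balanced": 25, "deep": 50}

def candidates_for(mode: str = "balanced") -> int:
    """Return how many candidates to fetch for a retrieval mode."""
    try:
        return MODE_K[mode]
    except KeyError:
        raise ValueError(f"unknown retrieval mode: {mode}") from None
```

Deep mode pulls 5x the candidates of fast mode, so it costs more embedding comparisons but is less likely to miss the right memory.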
Still local-first. Still zero external services. Works with Ollama + sentence-transformers if you want to stay fully offline.
GitHub: https://github.com/remete618/widemem-ai
Install: `pip install widemem-ai`
Would love feedback on the confidence thresholds. They work well with sentence-transformers and text-embedding-3-small but I haven't tested every model out there. If the thresholds feel off with your setup let me know.
[link] [comments]