BrainDB: Karpathy's 'LLM wiki' idea, but as a real DB with typed entities and a graph

Reddit r/LocalLLaMA / 4/20/2026


Key Points

  • BrainDB is inspired by Karpathy’s “LLM wiki” idea: giving an LLM persistent external memory it can read and write, but implemented as a structured database with retrieval and graph features.
  • Unlike stateless RAG, BrainDB stores typed, persistent entities (e.g., thoughts, facts, sources, rules) with explicit relations such as supports/contradicts/derived_from, and combines fuzzy + semantic search with graph traversal of up to 3 hops.
  • The system returns a ranked graph neighborhood (not a bag of retrieved text chunks) and uses temporal decay so older or stale items fade while frequently accessed ones remain salient.
  • Compared with classic graph databases, BrainDB is purpose-built for LLM agents: it offers an HTTP API for tool-calling, semantically meaningful fields (e.g., certainty, importance), automatic provenance, built-in rule injection, and retrieval scoring built on the PostgreSQL extensions pg_trgm and pgvector.
  • BrainDB is positioned as a more queryable alternative to flat Markdown memories by extracting and linking facts back to sources automatically and only loading full text when an agent explicitly requests it.

Why BrainDB?

BrainDB is inspired by Karpathy's LLM wiki idea: give an LLM a persistent external memory it can read and write. It takes that further by adding structure, retrieval, and a graph on top of the "plain markdown files" baseline.

  • vs. RAG. RAG is stateless: embed documents, retrieve similar chunks on every query, stuff them into context. There's no notion of an entity that persists, accrues connections, or ages. BrainDB stores typed entities (thoughts, facts, sources, documents, rules) with explicit supports / contradicts / elaborates / derived_from / similar_to relations, combined fuzzy + semantic search, graph traversal up to 3 hops, and temporal decay so stale items fade while accessed ones stay sharp. Retrieval returns a ranked graph neighbourhood, not a pile of chunks.
  • vs. classic graph DBs (Neo4j, Memgraph). Those are general-purpose graph stores with their own query languages and operational cost. BrainDB is purpose-built for LLM agents: a plain HTTP API designed for tool-calling, semantically meaningful fields (certainty, importance, emotional_valence), built-in text + pgvector search with geometric-mean scoring, always-on rule injection, and automatic provenance. It runs on plain PostgreSQL + pg_trgm + pgvector, so there is no new infrastructure to operate.
  • vs. markdown files as memory. Markdown wikis are flat and unstructured: the LLM has to grep, read whole files into context, and manage linking by hand. BrainDB's entities are atomic, queryable, ranked, and self-connecting. Facts extracted from a document automatically link back to the source via derived_from; recall returns relevant nodes plus their graph neighbourhood; nothing needs to be read in full unless the agent asks for it.
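
To make the comparison above more concrete, here is a minimal in-memory sketch of the ideas the post describes: typed entities, named relations, traversal capped at 3 hops, geometric-mean scoring, and temporal decay. It is not BrainDB's actual schema or API. The relation names and the certainty/importance fields come from the post; the class names, the undirected traversal, the exact form of the geometric mean, and the exponential half-life decay are all assumptions made for illustration.

```python
import math
from collections import deque
from dataclasses import dataclass, field

# Relation names are taken from the post; everything else is illustrative.
RELATIONS = {"supports", "contradicts", "elaborates", "derived_from", "similar_to"}


@dataclass
class Entity:
    id: str
    kind: str  # e.g. "thought", "fact", "source", "document", "rule"
    text: str
    certainty: float = 1.0
    importance: float = 0.5


@dataclass
class Memory:
    """Hypothetical stand-in for the store; BrainDB itself runs on PostgreSQL."""
    entities: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (src_id, relation, dst_id)

    def add(self, entity: Entity) -> None:
        self.entities[entity.id] = entity

    def relate(self, src: str, relation: str, dst: str) -> None:
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self.edges.append((src, relation, dst))

    def neighbourhood(self, start: str, max_hops: int = 3) -> dict:
        """Breadth-first search returning {entity_id: hop_distance}.

        Assumption: edges are followed in both directions.
        """
        adjacent = {}
        for src, _, dst in self.edges:
            adjacent.setdefault(src, set()).add(dst)
            adjacent.setdefault(dst, set()).add(src)
        distance = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if distance[node] == max_hops:
                continue  # do not expand beyond the hop limit
            for nxt in adjacent.get(node, ()):
                if nxt not in distance:
                    distance[nxt] = distance[node] + 1
                    queue.append(nxt)
        return distance


def geometric_mean(text_score: float, vector_score: float) -> float:
    """Combine a fuzzy-text score and a semantic score, both assumed in [0, 1]."""
    return math.sqrt(text_score * vector_score)


def decayed(score: float, days_since_access: float, half_life_days: float = 30.0) -> float:
    """Assumed exponential decay: the score halves every half_life_days."""
    return score * 0.5 ** (days_since_access / half_life_days)


# Usage: a fact extracted from a document links back to it via derived_from.
mem = Memory()
mem.add(Entity("doc1", "document", "Paper on sleep and memory"))
mem.add(Entity("fact1", "fact", "Sleep aids memory consolidation", certainty=0.9))
mem.relate("fact1", "derived_from", "doc1")
print(mem.neighbourhood("fact1"))  # {'fact1': 0, 'doc1': 1}
```

One property of a geometric mean, whatever the real implementation looks like, is that a candidate scoring near zero on either the text or the vector side ends up with a low combined score, so a result has to match reasonably on both to rank well.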
submitted by /u/dimknaf