I open-sourced a tool that compiles raw documents into an AI-navigable wiki with persistent memory; runs 100% locally

Reddit r/LocalLLaMA / 4/6/2026

📰 News · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage

Key Points

  • The author open-sourced “aura-research,” a tool that ingests a folder of raw documents (PDFs, papers, notes, code, and 60+ formats) and compiles them into an AI-navigable Markdown wiki with cross-links and an index.
  • The tool packages knowledge into a compressed “.aura” archive optimized for retrieval, claiming about 97% smaller output than the original source data for more efficient RAG-style querying.
  • It avoids embeddings and vector databases, using SimHash and Bloom Filters for indexing, and relies on structured wiki navigation so the LLM can load only a small number of relevant articles at query time.
  • It includes a built-in “3-tier Memory OS” (facts, episodic, scratch pad) intended to preserve important context across sessions, while keeping an option to work with any LLM provider.
  • The entire workflow is designed to run locally (no data leaving the machine), and the author is soliciting community feedback—especially on the “structured wiki vs vector embeddings” tradeoff and potential productization ideas.

After seeing Karpathy's tweet about using LLMs to build personal wikis from research documents, I realized I'd already been using something similar internally for our R&D.

So I cleaned it up and open-sourced it.

What it does: You drop a folder of raw documents (PDFs, papers, notes, code, 60+ formats) and the LLM compiles them into a structured markdown wiki with backlinked articles, concept pages, and a master index. It then compresses everything into a .aura archive optimized for RAG retrieval (~97% smaller than raw source data).
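To illustrate the packaging step, here is a minimal sketch of bundling a compiled wiki of Markdown articles into a single compressed archive. This is hypothetical (plain tar + LZMA); the real .aura format is the tool's own and may differ.

```python
# Hypothetical sketch of packing a compiled wiki into one compressed
# archive; the actual .aura format may differ. It shows the general idea:
# many small .md articles bundled into a single file, with any one
# article retrievable without unpacking the rest.
import tarfile
from pathlib import Path


def pack_wiki(wiki_dir: str, out_path: str) -> None:
    """Bundle every Markdown article under wiki_dir into an
    LZMA-compressed tarball."""
    with tarfile.open(out_path, "w:xz") as tar:
        for md in sorted(Path(wiki_dir).rglob("*.md")):
            tar.add(md, arcname=str(md.relative_to(wiki_dir)))


def read_article(archive: str, name: str) -> str:
    """Pull a single article back out of the archive."""
    with tarfile.open(archive, "r:xz") as tar:
        return tar.extractfile(name).read().decode()
```

LZMA compresses repetitive Markdown well, which is consistent with large size reductions on text-heavy corpora, though the exact ratio depends entirely on the source data.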

How it works:

```shell
pip install aura-research
research init my-project
# copy docs into raw/
research ingest raw/
research compile
research query "your question"
```

Key design decisions:

  • No embeddings, no vector databases. Uses SimHash + Bloom Filters instead. Zero RAM overhead.
  • Built-in 3-tier Memory OS (facts / episodic / scratch pad) so the LLM doesn't forget important context across sessions
  • The wiki is just .md files, browse in Obsidian, VS Code, or whatever you like
  • Works with any LLM provider (OpenAI, Anthropic, Gemini) or as an agent-native tool inside Claude Code/Gemini CLI where no API key is needed
  • Everything runs locally. No data leaves your machine.
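For readers unfamiliar with the first design decision: SimHash produces a fixed-size fingerprint whose bitwise distance approximates content similarity, and a Bloom filter answers "definitely not present" membership checks with a small fixed bit array. The sketch below shows the general technique only; it is not the aura-research implementation, and the hash choices and sizes are illustrative.

```python
# Illustrative sketch of SimHash fingerprinting plus a Bloom filter,
# the two structures named above. Hypothetical parameters; not the
# actual aura-research code.
import hashlib


def simhash(text: str, bits: int = 64) -> int:
    """Collapse a document's tokens into one fingerprint; similar texts
    yield fingerprints with small Hamming distance."""
    weights = [0] * bits
    for token in text.lower().split():
        h = int.from_bytes(hashlib.md5(token.encode()).digest()[:8], "big")
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if weights[i] > 0)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


class BloomFilter:
    """Fixed-size bit array: 'not in filter' is always correct,
    'in filter' may rarely be a false positive."""

    def __init__(self, size: int = 4096, hashes: int = 3):
        self.size, self.hashes, self.bits = size, hashes, 0

    def _positions(self, item: str):
        for i in range(self.hashes):
            digest = hashlib.md5(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:4], "big") % self.size

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits |= 1 << p

    def __contains__(self, item: str) -> bool:
        return all(self.bits >> p & 1 for p in self._positions(item))
```

Both structures are a handful of integers per document, which is where a "near-zero RAM" claim becomes plausible compared to holding dense embedding vectors in memory.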

The "no embeddings" choice: I deliberately avoided the standard RAG pipeline (chunk → embed → vector search). Instead, the LLM compiles the knowledge into a well-structured wiki with an index. When you query, it reads the index, finds the 2-3 relevant articles, and loads only those. The LLM is smart enough to navigate a good file structure; you don't need a separate embedding model if the knowledge is properly organized.
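The query flow above can be sketched in a few lines. File names and the `pick` callback are hypothetical stand-ins (the real tool's layout and prompting will differ); the point is that only the small index plus a handful of chosen articles ever enter the context window.

```python
# Sketch of index-driven retrieval instead of vector search.
# "index.md" and the article naming are assumptions for illustration;
# `pick` stands in for the LLM choosing article titles from the index.
from pathlib import Path
from typing import Callable, List


def load_relevant(wiki_dir: str, question: str,
                  pick: Callable[[str, str], List[str]]) -> str:
    index = Path(wiki_dir, "index.md").read_text()   # 1. read the master index
    titles = pick(index, question)                   # 2. LLM picks 2-3 articles
    parts = [Path(wiki_dir, f"{t}.md").read_text()   # 3. load only those files
             for t in titles]
    return "\n\n".join(parts)
```

The tradeoff versus embeddings: retrieval quality now depends on how well the compile step organized and indexed the wiki, rather than on the recall of a vector search over chunks.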

GitHub: https://github.com/Rtalabs-ai/aura-research
PyPI: pip install aura-research

Would love feedback from this community, especially on the "structured wiki vs vector embeddings" tradeoff. Looking forward to your thoughts!

I'm also thinking about packaging this into a product; any insights would be appreciated!

submitted by /u/manoman42