Looking for feedback on my Agentic RAG System

Reddit r/LocalLLaMA / 3/29/2026

Opinion | Developer Stack & Infrastructure | Signals & Early Trends | Tools & Practical Usage

Key Points

The author is seeking feedback on an “agentic RAG” system designed to be shippable rather than a basic upload-and-ask demo.
The system includes authenticated users with document ownership and document-scoped retrieval to reduce cross-document data leakage.
It uses an agent loop with tool calling where the retriever is exposed as a tool, along with query refinement, semantic caching, and pluggable embeddings with optional reranking.
The stack combines FastAPI, SQLAlchemy, Postgres (pgvector), and Chroma, with OpenAI/HuggingFace embeddings and an optional Cohere reranker, packaged via Docker.
It also features an evaluation pipeline with run history and case inspection plus a built-in UI for asking questions and running evaluations, with the repo provided for review.

Hey everyone,

I've been working on a RAG system and would really appreciate some feedback from people who have built or scaled similar systems.

This isn't just a basic "upload + ask" demo — I tried to design it more like something you'd actually ship.
