CLI that diagnoses broken RAG pipelines (looking for feedback)

Reddit r/LocalLLaMA / 3/13/2026

Key Points

  • The author has built a CLI that analyzes a codebase to detect structural problems in RAG pipelines (LangChain, LlamaIndex, and custom pipelines) and is seeking feedback.
  • The tool is designed to be deterministic, with AI only used to explain findings in plain language so it can run in CI and produce reproducible results.
  • It aims to be like 'ESLint for RAG architectures,' parsing code, applying a rule engine, and reporting issues such as incorrect chunking, embedding-model mismatches, missing retrieval, context window overflow, misconfigured vector search, and prompt injection risks.
  • The author invites the community to test unusual pipelines and provides a GitHub repository link for feedback: https://github.com/NeuroForgeLabs/rag-doctor.

Hey everyone,

Over the past few months I’ve been building and testing different RAG setups (LangChain, LlamaIndex, custom pipelines, etc.), and I kept running into the same frustrating issue.

When a RAG system starts producing bad answers, everyone immediately blames the LLM.

But most of the time the actual problem is somewhere in the pipeline.

Things like:

• documents aren’t chunked correctly
• embeddings don’t match the retrieval model
• retrieval isn’t actually happening when you think it is
• context window is overflowing
• vector search is misconfigured
• prompt injection risks
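As one concrete illustration of the items above, here is a minimal sketch of a context window overflow check. The function name, its parameters, and the numbers are all made up for illustration (they are not taken from rag-doctor), and the token counts are assumed to come from whatever tokenizer your model uses.

```python
def context_budget(chunk_token_counts, prompt_tokens, max_context_tokens, reserved_for_answer):
    """Return the tokens left after packing the prompt, the retrieved chunks,
    and the space reserved for the answer. A negative value means overflow."""
    used = prompt_tokens + sum(chunk_token_counts) + reserved_for_answer
    return max_context_tokens - used


# Hypothetical setup: 8 retrieved chunks of 512 tokens each, a 300-token
# prompt template, a 4096-token window, and 512 tokens kept for the answer.
remaining = context_budget(
    chunk_token_counts=[512] * 8,
    prompt_tokens=300,
    max_context_tokens=4096,
    reserved_for_answer=512,
)

if remaining < 0:
    print(f"context overflow by {-remaining} tokens")
```

In this made-up setup the retrieved chunks alone fill the whole window, so the prompt or the answer gets truncated no matter how good the LLM is.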

After debugging this stuff over and over, I started building a small CLI tool that analyzes a codebase and tries to detect structural problems in RAG pipelines.

The idea is basically:

“ESLint but for RAG architectures.”

The tool parses the codebase, runs a rule engine, and reports possible issues.
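To make the parse, rule engine, report flow concrete, here is a minimal sketch using Python's standard `ast` module. It is not rag-doctor's actual implementation: the `Finding` class, the `RAG001` rule ID, and the single rule shown (flagging `chunk_overlap >= chunk_size` when both are literal integers) are assumptions for illustration only.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Finding:
    line: int
    rule_id: str   # hypothetical ID scheme
    message: str


def _int_kwarg(call, name):
    """Return the literal integer passed as keyword `name`, or None."""
    for kw in call.keywords:
        if (kw.arg == name
                and isinstance(kw.value, ast.Constant)
                and isinstance(kw.value.value, int)):
            return kw.value.value
    return None


def rule_overlap_exceeds_chunk(tree):
    """Flag any call whose chunk_overlap is not smaller than its chunk_size."""
    findings = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            size = _int_kwarg(node, "chunk_size")
            overlap = _int_kwarg(node, "chunk_overlap")
            if size is not None and overlap is not None and overlap >= size:
                findings.append(Finding(
                    node.lineno, "RAG001",
                    f"chunk_overlap ({overlap}) >= chunk_size ({size})"))
    return findings


RULES = [rule_overlap_exceeds_chunk]


def analyze(source):
    """Parse the source once, run every rule, return findings in stable order."""
    tree = ast.parse(source)
    return sorted(f for rule in RULES for f in rule(tree))


SAMPLE = '''
bad = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=500)
ok = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
'''

for finding in analyze(SAMPLE):
    print(f"line {finding.line}: [{finding.rule_id}] {finding.message}")
```

The sample code is only parsed, never executed, so the analyzed project's dependencies don't need to be installed for a check like this to run.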

One important design choice I made:

The analysis itself is deterministic. AI is only used to explain the findings in plain language.

That way the tool can still run in CI and produce reproducible results.
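A rough sketch of what "reproducible in CI" can mean in practice: sort the findings, serialize them with stable key order, and derive the exit code purely from the findings. The dictionary fields (`file`, `line`, `rule_id`), the file names, and the exit code convention here are assumptions for illustration, not the tool's actual output format.

```python
import json


def render_report(findings):
    """Serialize findings so the same set always yields byte-identical output."""
    ordered = sorted(findings, key=lambda f: (f["file"], f["line"], f["rule_id"]))
    return json.dumps(ordered, sort_keys=True, indent=2)


def exit_code(findings):
    """Non-zero when anything was found, so a CI job can fail on it."""
    return 1 if findings else 0


# The same two hypothetical findings, discovered in different orders.
run_a = [
    {"file": "ingest.py", "line": 12, "rule_id": "RAG001"},
    {"file": "app.py", "line": 40, "rule_id": "RAG002"},
]
run_b = list(reversed(run_a))

print(render_report(run_a) == render_report(run_b))
```

Because nothing in this path calls a model, two runs over the same commit can be diffed directly. The AI-generated explanation can then be layered on top without affecting pass or fail.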

It’s still early, but I’m curious:

What RAG issues are you seeing most often in real projects?

Also, if anyone wants to try breaking it with weird pipelines, that would be very helpful.

Repo:

https://github.com/NeuroForgeLabs/rag-doctor

Would really appreciate feedback from people building RAG systems.

submitted by /u/anvarxadja99