AI Navigate

Open-source, local document parsing CLI by LlamaIndex: LiteParse

Reddit r/LocalLLaMA / 3/20/2026

📰 News · Developer Stack & Infrastructure · Tools & Practical Usage

Key Points

  • LiteParse is a lightweight, open-source CLI tool for local document parsing that preserves the original spatial layout to feed into LLMs, instead of reconstructing document structure.
  • It supports parsing PDFs, DOCX, XLSX, and images with layout preserved, and includes built-in OCR, with optional PaddleOCR or EasyOCR backends over HTTP for stronger results.
  • The tool adds screenshot capability so agents can reason over pages visually, enabling multimodal workflows while running entirely offline.
  • Output is designed to plug straight into AI agents; the project notes that LlamaParse remains superior for complex layouts, while LiteParse covers many common use cases.

LiteParse is a lightweight CLI tool for local document parsing, born out of everything we learned building LlamaParse. The core idea is pretty simple: rather than trying to detect and reconstruct document structure, it preserves spatial layout as-is and passes that to your LLM. This works well in practice because LLMs are already trained on ASCII tables and indented text, so they understand the format naturally without you having to do extra wrangling.
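To make the idea concrete, here is a minimal, hypothetical sketch of layout preservation (not LiteParse's actual code, and the data is made up): given words with page coordinates from an extractor or OCR, place each word at its approximate column instead of trying to detect table structure, so the spatial layout survives as plain text an LLM can read directly.

```python
# Toy illustration of layout-preserving parsing: bucket extracted words by
# row, then pad each word out to a column derived from its x coordinate.
# Hypothetical helper and data, for illustration only.

def render_layout(words, chars_per_unit=0.1):
    """words: list of (x, y, text) tuples from an extractor or OCR engine."""
    rows = {}
    for x, y, text in words:
        rows.setdefault(y, []).append((x, text))
    out = []
    for y in sorted(rows):
        line = ""
        for x, text in sorted(rows[y]):
            col = int(x * chars_per_unit)
            if col > len(line):
                line += " " * (col - len(line))  # pad to the word's column
            elif line:
                line += " "  # avoid gluing adjacent words together
            line += text
        out.append(line)
    return "\n".join(out)

# A two-column "table" given only word positions:
words = [
    (0, 0, "Item"),   (300, 0, "Price"),
    (0, 1, "Apple"),  (300, 1, "1.20"),
    (0, 2, "Banana"), (300, 2, "0.75"),
]
print(render_layout(words))
```

The output lines up as an ASCII table with "Price" values in a second column, which an LLM reads as a table without any structure-detection step.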

A few things it can do:

  • Parse text from PDFs, DOCX, XLSX, and images with layout preserved
  • Built-in OCR, with support for PaddleOCR or EasyOCR via HTTP if you need something more robust
  • Screenshot capability so agents can reason over pages visually for multimodal workflows

Everything runs locally, no API calls, no cloud dependency. The output is designed to plug straight into agents.

For more complex documents (scanned PDFs with messy layouts, dense tables, that kind of thing) LlamaParse is still going to give you better results. But for a lot of common use cases this gets you pretty far without the overhead.

Would love to hear what you build with it or any feedback on the approach.

📖 Announcement
🔗 GitHub

submitted by /u/tuanacelik