I’m looking for advice on setting up a local AI model that can generate Word reports automatically.

Reddit r/artificial / 4/13/2026

💬 Opinion | Developer Stack & Infrastructure | Tools & Practical Usage | Models & Research

Key Points

  • The author asks for guidance on building a locally runnable, privacy-preserving system that can automatically generate Word reports using an AI model.
  • They want the model to learn the structure of ~500 existing manually written reports, which consist of images plus text descriptions placed above each image.
  • The proposed capabilities include image understanding, generating structured text descriptions matching the existing report format, and exporting the results into a formatted Word document.
  • They are deciding between fine-tuning a vision-language model and using a retrieval-augmented generation (RAG) approach grounded in their existing report corpus.
  • They request recommendations for specific models, tools, and workflows to implement this end-to-end pipeline offline.

Hi everyone,

I’m looking for advice on setting up a local AI model that can generate Word reports automatically.

I already have around 500 manually created reports, and I want to train or fine-tune a model to understand their structure and start generating new reports in the same format.

The reports are structured as:

- Images

- Text descriptions above each image

So basically, I need a system that can:

  1. Understand images

  2. Generate structured descriptions similar to my existing reports

  3. Export everything into a formatted Word document
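To make the three steps concrete, here is a rough sketch of the pipeline I have in mind, not a working solution. It assumes the third-party `python-docx` package for the Word export. The `describe` callable is a hypothetical placeholder for whatever local vision-language model ends up producing the text, and `ReportEntry`, `build_entries`, and `export_docx` are names I made up for illustration.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Iterable, List


@dataclass
class ReportEntry:
    """One report item: an image and the description placed above it."""
    image_path: Path
    description: str


def build_entries(
    image_paths: Iterable[str],
    describe: Callable[[Path], str],
) -> List[ReportEntry]:
    """Steps 1 and 2: run the (hypothetical) local model over each image."""
    entries = []
    for raw_path in image_paths:
        path = Path(raw_path)
        entries.append(ReportEntry(image_path=path, description=describe(path)))
    return entries


def export_docx(entries: List[ReportEntry], out_path: str) -> None:
    """Step 3: write each description followed by its image into a .docx file."""
    # Assumption: python-docx is installed (pip install python-docx).
    # Imported here so the rest of the sketch runs without it.
    from docx import Document
    from docx.shared import Inches

    doc = Document()
    for entry in entries:
        doc.add_paragraph(entry.description)  # text goes above the image
        doc.add_picture(str(entry.image_path), width=Inches(5))
    doc.save(out_path)
```

Usage would be something like `export_docx(build_entries(paths, my_model_fn), "report.docx")`, where `my_model_fn` wraps the local model. Matching the exact styles of my existing reports would probably mean opening one of them as a template instead of starting from a blank `Document()`.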

I prefer something that can run locally (offline) for privacy reasons.

What would be the best models or approach for this?

- Should I fine-tune a vision-language model?

- Or use something like retrieval (RAG) with my existing reports?
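For the retrieval option, this is roughly what I imagine, as a toy sketch using only the Python standard library. It assumes the vision model first produces a rough draft caption, then the most similar descriptions from my old reports are pulled in as style examples for the final prompt. The bag-of-words cosine similarity is just a stand-in for a real embedding model, and all function names and the sample texts are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_examples(draft: str, past_descriptions: List[str], k: int = 3) -> List[str]:
    """Return up to k past descriptions that share vocabulary with the draft."""
    query = Counter(tokenize(draft))
    scored = [(cosine(query, Counter(tokenize(d))), d) for d in past_descriptions]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def build_prompt(draft: str, examples: List[str]) -> str:
    """Assemble a few-shot prompt asking the model to rewrite the draft in house style."""
    shots = "\n".join(f"- {e}" for e in examples)
    return (
        "Rewrite the draft in the style of these past descriptions:\n"
        f"{shots}\n\nDraft: {draft}\nFinal description:"
    )


# Made-up sample data, only to show the call pattern.
past = [
    "Crack in the north wall near the window",
    "Water damage on the ceiling",
    "Rust on the steel beam",
]
examples = retrieve_examples("ceiling water stain", past, k=1)
prompt = build_prompt("ceiling water stain", examples)
```

The appeal of this route is that nothing has to be trained: the 500 reports only need to be indexed, and the model sees a handful of them at generation time.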

Any recommendations (models, tools, or workflows) would be really appreciated 🙏

submitted by /u/Azab28