Hi everyone,
I’m looking for advice on setting up a local AI model that can generate Word reports automatically.
I already have around 500 manually created reports, and I want to train or fine-tune a model to understand their structure and start generating new reports in the same format.
The reports are structured as:
- Images
- Text descriptions above each image
So basically, I need a system that can:
- Understand images
- Generate structured descriptions similar to my existing reports
- Export everything into a formatted Word document
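
To make those three steps concrete, here is a rough sketch of the pipeline I have in mind. It assumes Python and the third-party python-docx library for the Word export. The `describe` argument is a stand-in for whatever local vision-language model ends up being used, and `ReportItem`, `collect_items`, `write_report`, and `placeholder_describe` are names made up for illustration, not part of any library.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List


@dataclass
class ReportItem:
    """One report entry: an image and the text that goes above it."""
    image_path: Path
    description: str


def collect_items(image_dir, describe: Callable[[Path], str]) -> List[ReportItem]:
    """Pair every image in image_dir (sorted by path) with a generated description."""
    exts = {".png", ".jpg", ".jpeg"}
    images = sorted(p for p in Path(image_dir).iterdir() if p.suffix.lower() in exts)
    return [ReportItem(p, describe(p)) for p in images]


def write_report(items: List[ReportItem], out_path, title: str = "Report") -> None:
    """Write the items to a .docx file, description first, then the image."""
    # Imported here so the rest of the sketch runs without python-docx installed.
    from docx import Document
    from docx.shared import Inches

    doc = Document()
    doc.add_heading(title, level=1)
    for item in items:
        doc.add_paragraph(item.description)
        doc.add_picture(str(item.image_path), width=Inches(6))
    doc.save(str(out_path))


def placeholder_describe(image_path: Path) -> str:
    """Hypothetical stand-in: a real version would call a local vision-language model."""
    return f"[description for {image_path.name}]"
```

Usage would look something like `write_report(collect_items("images/", placeholder_describe), "report.docx")`, with `placeholder_describe` swapped for the real model call. The open question is what goes inside that function.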
I prefer something that can run locally (offline) for privacy reasons.
What would be the best models or approach for this?
- Should I fine-tune a vision-language model?
- Or use something like retrieval (RAG) with my existing reports?
Any recommendations (models, tools, or workflows) would be really appreciated 🙏