FlowExtract: Procedural Knowledge Extraction from Maintenance Flowcharts
arXiv cs.CV / 4/9/2026
Key Points
- FlowExtract is a pipeline that extracts directed-graph procedural knowledge from ISO 5807-standard maintenance flowcharts that are typically locked in static PDFs or scanned images.
- The approach splits the task into two stages: node detection with text extraction (YOLOv8 to detect domain-aligned flowchart elements, EasyOCR to read their text) and a separate edge reconstruction stage that recovers connectivity.
- For edge extraction, FlowExtract introduces a novel method that detects arrowhead orientation and traces each connecting line backward to its source node, recovering the diagram's topology.
- Experiments on industrial troubleshooting guides show very high node detection performance and substantially improved edge extraction compared with vision-language model baselines.
- The authors provide an open-source implementation (GitHub), positioning FlowExtract as a practical way to convert maintenance diagrams into queryable representations for operator support systems.
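
The backward-tracing idea in the key points can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes node bounding boxes and arrowheads have already been detected upstream, that arrowheads are axis-aligned, and that connector lines are available as a set of pixel coordinates. The names `Node`, `trace_edge`, and `build_graph` are hypothetical and chosen only for this example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A detected flowchart node with an inclusive pixel bounding box."""
    name: str
    x0: int
    y0: int
    x1: int
    y1: int

    def contains(self, x, y):
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1


# Unit steps for the four arrowhead orientations (assumption: axis-aligned only).
DIRECTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def node_at(nodes, x, y):
    """Return the node whose box contains (x, y), or None."""
    for node in nodes:
        if node.contains(x, y):
            return node
    return None


def trace_edge(nodes, line_pixels, tip, orientation):
    """Recover one directed edge (source, target) from a single arrowhead.

    The target is the node the arrowhead points into. The source is the first
    other node reached by a breadth-first walk along connector pixels,
    starting at the arrowhead tip and moving away from the target.
    """
    dx, dy = DIRECTIONS[orientation]
    target = node_at(nodes, tip[0] + dx, tip[1] + dy)
    if target is None:
        return None
    queue = deque([tip])
    seen = {tip}
    while queue:
        x, y = queue.popleft()
        for sx, sy in DIRECTIONS.values():
            nxt = (x + sx, y + sy)
            if nxt in seen:
                continue
            hit = node_at(nodes, *nxt)
            if hit is not None and hit != target:
                return (hit.name, target.name)
            if nxt in line_pixels:
                seen.add(nxt)
                queue.append(nxt)
    return None


def build_graph(nodes, line_pixels, arrowheads):
    """Build an adjacency list {source: [targets]} from all arrowheads."""
    graph = {node.name: [] for node in nodes}
    for tip, orientation in arrowheads:
        edge = trace_edge(nodes, line_pixels, tip, orientation)
        if edge is not None:
            graph[edge[0]].append(edge[1])
    return graph


# Toy diagram: A connects straight down to B, and via a bent line to C.
nodes = [Node("A", 0, 0, 2, 2), Node("B", 0, 6, 2, 8), Node("C", 6, 6, 8, 8)]
line_pixels = {(1, 3), (1, 4)} | {(x, 1) for x in range(3, 8)} | {(7, y) for y in range(2, 5)}
arrowheads = [((1, 5), "down"), ((7, 5), "down")]
graph = build_graph(nodes, line_pixels, arrowheads)
```

On this toy input, `graph["A"]` lists both `B` and `C` as successors, which is the kind of queryable directed-graph representation the summary describes. A real pipeline would also need to handle crossing lines, diagonal arrowheads, and OCR-derived labels, none of which this sketch attempts.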



