FlowExtract: Procedural Knowledge Extraction from Maintenance Flowcharts

arXiv cs.CV / 4/9/2026

Key Points

FlowExtract is a pipeline that extracts directed-graph procedural knowledge from ISO 5807-standard maintenance flowcharts that are typically locked in static PDFs or scanned images.
The approach decomposes the task into node detection and text extraction (YOLOv8 + EasyOCR for domain-aligned elements) and a separate connectivity/edge reconstruction stage.
For edge extraction, FlowExtract uses a novel method that analyzes arrowhead orientation and traces connecting lines backward to their source nodes, recovering the diagram’s topology.
Experiments on industrial troubleshooting guides show very high node detection performance and substantially improved edge extraction compared with vision-language model baselines.
The authors provide an open-source implementation (GitHub), positioning FlowExtract as a practical way to convert maintenance diagrams into queryable representations for operator support systems.
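The backward-tracing idea in the points above can be illustrated with a small sketch. This is not the authors' implementation. It assumes the diagram has already been reduced to a set of line pixel coordinates, that each arrowhead is given as a tip position plus a unit pointing direction, and that detected nodes are axis-aligned boxes. The names `reconstruct_edge`, `trace_back`, and `box_at` are hypothetical.

```python
from collections import deque

# Assumption: boxes are (x0, y0, x1, y1) with inclusive integer pixel bounds.
def box_at(point, node_boxes):
    """Return the name of the node box containing `point`, or None."""
    x, y = point
    for name, (x0, y0, x1, y1) in node_boxes.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return name
    return None

def trace_back(start, line_pixels, node_boxes, target):
    """Follow connected line pixels from `start` until a pixel next to the
    line falls inside a node box other than `target`."""
    seen = {start}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            hit = box_at(nxt, node_boxes)
            if hit is not None and hit != target:
                return hit  # reached the source node
            if nxt in line_pixels and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None  # line ends without touching another node

def reconstruct_edge(tip, direction, line_pixels, node_boxes):
    """Return (source, target) for one arrowhead, or None if either end is missing."""
    # The arrowhead orientation tells us which node the edge points at.
    target = box_at((tip[0] + direction[0], tip[1] + direction[1]), node_boxes)
    if target is None:
        return None
    source = trace_back(tip, line_pixels, node_boxes, target)
    return (source, target) if source is not None else None

# Hypothetical toy layout: three nodes, one straight and one bent connector.
boxes = {"A": (0, 0, 2, 2), "B": (0, 6, 2, 8), "C": (6, 6, 8, 8)}
straight = {(1, 3), (1, 4), (1, 5)}
bent = {(3, 1), (4, 1), (5, 1), (6, 1), (7, 1), (7, 2), (7, 3), (7, 4), (7, 5)}

print(reconstruct_edge((1, 5), (0, 1), straight, boxes))
print(reconstruct_edge((7, 5), (0, 1), bent, boxes))
```

Starting from the arrowhead rather than from the source node gives each edge a known direction before tracing begins, which is the property the summary attributes to the method. A real pipeline would also have to handle crossing lines, branching connectors, and noisy scans, none of which this sketch attempts.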

Abstract

Maintenance procedures in manufacturing facilities are often documented as flowcharts in static PDFs or scanned images. They encode procedural knowledge essential for asset lifecycle management, yet inaccessible to modern operator support systems. Vision-language models, the dominant paradigm for image understanding, struggle to reconstruct connection topology from such diagrams. We present FlowExtract, a pipeline for extracting directed graphs from ISO 5807-standardized flowcharts. The system separates element detection from connectivity reconstruction, using YOLOv8 and EasyOCR for standard domain-aligned node detection and text extraction, combined with a novel edge detection method that analyzes arrowhead orientations and traces connecting lines backward to source nodes. Evaluated on industrial troubleshooting guides, FlowExtract achieves very high node detection and substantially outperforms vision-language model baselines on edge extraction, offering organizations a practical path toward queryable procedural knowledge representations. The implementation is available at https://github.com/guille-gil/FlowExtract.
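To make "queryable procedural knowledge representation" concrete, here is a minimal sketch of how an extracted directed graph might be stored and queried. The `FlowGraph` class, the node kinds, the edge labels, and the troubleshooting steps are all hypothetical illustrations and are not taken from the paper or its repository.

```python
from dataclasses import dataclass, field

@dataclass
class FlowGraph:
    """Directed graph of flowchart steps (hypothetical structure)."""
    nodes: dict = field(default_factory=dict)   # node_id -> {"kind", "text"}
    edges: list = field(default_factory=list)   # (source, target, label)

    def add_node(self, node_id, kind, text):
        self.nodes[node_id] = {"kind": kind, "text": text}

    def add_edge(self, source, target, label=""):
        self.edges.append((source, target, label))

    def next_steps(self, node_id, answer=None):
        """Return successor node ids, optionally filtered by a branch label."""
        return [
            dst for src, dst, label in self.edges
            if src == node_id and (answer is None or label == answer)
        ]

# Invented example content, for illustration only.
graph = FlowGraph()
graph.add_node("start", "terminator", "Unit does not start")
graph.add_node("check_power", "decision", "Is the unit powered?")
graph.add_node("inspect_fuse", "process", "Inspect the main fuse")
graph.add_node("connect_power", "process", "Connect the power supply")
graph.add_edge("start", "check_power")
graph.add_edge("check_power", "inspect_fuse", "yes")
graph.add_edge("check_power", "connect_power", "no")

for node_id in graph.next_steps("check_power", "no"):
    print(graph.nodes[node_id]["text"])
```

An operator support system could walk such a structure one decision at a time, which is not possible while the procedure exists only as pixels in a PDF or scan.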
