TMTE: Effective Multimodal Graph Learning with Task-aware Modality and Topology Co-evolution
arXiv cs.LG / 3/31/2026
Key Points
- The paper identifies quality limitations in real-world multimodal-attributed graphs (MAGs), including noisy interactions, missing connections, and task-agnostic relational structures that reduce transfer across tasks.
- It proposes TMTE (Task-aware Modality and Topology co-Evolution), a closed-loop multimodal graph learning framework that jointly and iteratively optimizes both graph topology and multimodal representations for a specific target task.
- TMTE models topology evolution as multi-perspective metric learning over modality embeddings using an anchor-based approximation, while modality evolution uses smoothness-regularized fusion with cross-modal alignment.
- Experiments on 9 MAG datasets (plus 1 non-graph multimodal dataset) spanning 6 graph-centric and modality-centric tasks show that TMTE consistently achieves state-of-the-art performance.
- The authors have released their code publicly (link in the paper) to support reproduction and further development of TMTE.
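
To make the anchor-based topology idea and the smoothness regularizer in the points above more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names (`anchor_graph`, `multi_view_graph`, `smoothness`), the random anchor sampling, the top-k softmax weighting, the classic anchor-graph construction A = Z Λ⁻¹ Zᵀ, and the plain averaging across modalities are all assumptions chosen for illustration. It omits the task-aware objective, the learned metrics, and the cross-modal alignment that the summary attributes to TMTE.

```python
import numpy as np


def anchor_graph(embeddings, num_anchors=4, k=2, seed=0):
    """Approximate an n x n similarity graph through m << n anchors.

    Assumption: anchors are randomly sampled nodes and similarity is cosine;
    the paper may select anchors and define its metrics differently.
    """
    rng = np.random.default_rng(seed)
    n = embeddings.shape[0]
    anchor_idx = rng.choice(n, size=num_anchors, replace=False)

    # Cosine similarity between every node and every anchor: shape (n, m).
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = x @ x[anchor_idx].T

    # Keep only the k most similar anchors per node, then normalize each row.
    top = np.argsort(-sim, axis=1)[:, :k]
    rows = np.arange(n)[:, None]
    z = np.zeros_like(sim)
    z[rows, top] = np.exp(sim[rows, top])
    z /= z.sum(axis=1, keepdims=True)

    # Anchor-graph adjacency A = Z diag(1 / column_sum) Z^T.
    # It costs O(n * m) to build Z instead of O(n^2) pairwise comparisons.
    col = z.sum(axis=0)
    col[col == 0] = 1.0  # anchors that no node selected contribute nothing
    return (z / col) @ z.T


def multi_view_graph(modalities, **kwargs):
    """Average the anchor graphs of several modalities (illustrative choice)."""
    return np.mean([anchor_graph(m, **kwargs) for m in modalities], axis=0)


def smoothness(features, adj):
    """Graph smoothness tr(H^T L H) with the unnormalized Laplacian L = D - A."""
    lap = np.diag(adj.sum(axis=1)) - adj
    return float(np.trace(features.T @ lap @ features))


# Toy data: 10 nodes with hypothetical text (16-d) and image (8-d) embeddings.
rng = np.random.default_rng(42)
text_emb = rng.normal(size=(10, 16))
image_emb = rng.normal(size=(10, 8))

adj = multi_view_graph([text_emb, image_emb], num_anchors=4, k=2)
fused = np.concatenate([text_emb, image_emb], axis=1)  # naive fusion stand-in
penalty = smoothness(fused, adj)
```

In a closed loop of the kind the summary describes, a penalty like `smoothness(fused, adj)` would be added to the task loss so that the fused representations and the learned graph are updated alternately. How TMTE actually couples the two steps is specified in the paper, not here.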



