GraPHFormer: A Multimodal Graph Persistent Homology Transformer for the Analysis of Neuroscience Morphologies

arXiv cs.CV / 3/24/2026



Key Points

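GraPHFormer is proposed as a multimodal Transformer architecture that uses CLIP-style contrastive learning to unify topology and graph structure, two views that existing neuronal morphology methods analyze separately.
The vision branch processes a three-channel persistence image (unweighted, persistence-weighted, and radius-weighted) with DINOv2-ViT-S. In parallel, a TreeLSTM encodes geometric and radial attributes from the skeleton graph, and both branches project into a shared embedding space.
The shared embedding space is trained with a symmetric InfoNCE loss, augmented by persistence-space transformations that preserve topological semantics.
Across six benchmarks in both self-supervised and supervised settings, the model is reported to outperform topology-only, graph-only, and morphometrics-only baselines, and to achieve state-of-the-art results on five of the six.
The code is publicly available, and practical applications are shown: discriminating glial morphologies across cortical regions and species, and detecting signatures of developmental and degenerative processes.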
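To make the three-channel persistence image in the second key point more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `three_channel_persistence_image`, the (birth, persistence) grid, the Gaussian bandwidth `sigma`, the resolution, and the assumption that coordinates are pre-scaled to [0, 1] are all illustrative choices, and the way a radius is attached to each topological feature is assumed rather than taken from the paper.

```python
import numpy as np

def three_channel_persistence_image(diagram, radii, resolution=32, sigma=0.05):
    """Rasterize a persistence diagram into three weighted density channels.

    diagram: (n, 2) array-like of (birth, death) pairs, assumed scaled to [0, 1].
    radii:   (n,) array-like, one radius value per feature (hypothetical attribute).
    Returns an array of shape (3, resolution, resolution).
    """
    diagram = np.asarray(diagram, dtype=float)
    radii = np.asarray(radii, dtype=float)
    birth = diagram[:, 0]
    # Persistence is the bar length; abs() also covers diagrams with birth > death.
    pers = np.abs(diagram[:, 1] - diagram[:, 0])

    # Pixel centres on a (birth, persistence) grid.
    axis = np.linspace(0.0, 1.0, resolution)
    gx, gy = np.meshgrid(axis, axis, indexing="xy")

    # One isotropic Gaussian bump per feature, shape (n, resolution, resolution).
    d2 = (gx[None] - birth[:, None, None]) ** 2 + (gy[None] - pers[:, None, None]) ** 2
    bumps = np.exp(-d2 / (2.0 * sigma ** 2))

    # Channel weights: unweighted, persistence-weighted, radius-weighted.
    weights = np.stack([np.ones_like(pers), pers, radii])  # shape (3, n)
    return np.einsum("cn,nhw->chw", weights, bumps)

image = three_channel_persistence_image([[0.2, 0.6], [0.5, 0.9]], [1.0, 2.0])
print(image.shape)
```

The resulting array has the same layout as a three-channel image, which is what allows an image backbone such as a ViT to consume it after resizing and normalization.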

Abstract

Neuronal morphology encodes critical information about circuit function, development, and disease, yet current methods analyze topology or graph structure in isolation. We introduce GraPHFormer, a multimodal architecture that unifies these complementary views through CLIP-style contrastive learning. Our vision branch processes a novel three-channel persistence image encoding unweighted, persistence-weighted, and radius-weighted topological densities via DINOv2-ViT-S. In parallel, a TreeLSTM encoder captures geometric and radial attributes from skeleton graphs. Both project to a shared embedding space trained with symmetric InfoNCE loss, augmented by persistence-space transformations that preserve topological semantics. Evaluated on six benchmarks (BIL-6, ACT-4, JML-4, N7, M1-Cell, M1-REG) spanning self-supervised and supervised settings, GraPHFormer achieves state-of-the-art performance on five benchmarks, significantly outperforming topology-only, graph-only, and morphometrics baselines. We demonstrate practical utility by discriminating glial morphologies across cortical regions and species, and detecting signatures of developmental and degenerative processes. Code: https://github.com/Uzshah/GraPHFormer
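The symmetric InfoNCE objective mentioned in the abstract can be sketched as follows. This is a generic CLIP-style formulation in NumPy, not code from the GraPHFormer repository: the function names, the temperature value of 0.07, and the equal weighting of the two directions are assumptions made for illustration.

```python
import numpy as np

def log_softmax(x, axis):
    """Numerically stable log-softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def symmetric_info_nce(image_emb, graph_emb, temperature=0.07):
    """Symmetric InfoNCE loss for a batch of paired embeddings.

    image_emb, graph_emb: (batch, dim) arrays; row i of each is a matched pair.
    temperature: assumed value, not taken from the paper.
    """
    # L2-normalize so that dot products are cosine similarities.
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    gra = graph_emb / np.linalg.norm(graph_emb, axis=1, keepdims=True)

    # Pairwise similarity matrix; matched pairs sit on the diagonal.
    logits = img @ gra.T / temperature
    idx = np.arange(logits.shape[0])

    # Image-to-graph: each row should pick out its own column.
    loss_i2g = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Graph-to-image: each column should pick out its own row.
    loss_g2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2g + loss_g2i)

rng = np.random.default_rng(0)
batch = rng.normal(size=(8, 16))
print(symmetric_info_nce(batch, batch + 0.01 * rng.normal(size=(8, 16))))
```

Averaging the two directions means neither modality is treated as the fixed "query", which is the sense in which the loss is symmetric.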