FILTR: Extracting Topological Features from Pretrained 3D Models

arXiv cs.CV / 4/27/2026

Key Points

  • The paper investigates whether topological descriptors (persistence diagrams) can be extracted from feature representations produced by pretrained 3D point-cloud encoders such as Point-BERT and Point-MAE.
  • It introduces DONUT, a synthetic benchmark designed to control and vary topological complexity, enabling systematic evaluation of topological recovery from learned features.
  • The authors propose FILTR (Filtration Transformer), a learnable framework that predicts persistence diagrams directly from frozen 3D encoders by reframing diagram generation as a set prediction problem using a transformer decoder.
  • Experiments on DONUT suggest that existing encoders preserve only limited global topological information, but FILTR can still approximate persistence diagrams by effectively leveraging the retained signals.
  • The approach is positioned as the first data-driven way to extract persistence diagrams from raw point clouds using an efficient learnable feed-forward mechanism.
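For readers unfamiliar with persistence diagrams, the sketch below computes the simplest case by hand: the 0-dimensional diagram of a Vietoris-Rips filtration, which tracks when connected components merge as a distance threshold grows. It illustrates the kind of target FILTR is trained to predict and is not code from the paper. The function name `h0_persistence` and the toy point cloud are assumptions made for illustration, and a merge is recorded at the scale equal to the edge length (some libraries use half that value).

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Finite (birth, death) pairs of the 0-dimensional Vietoris-Rips diagram.

    Every point is born as its own component at scale 0. A component dies
    at the length of the edge that first merges it into another component.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        # Find the component representative, with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process all pairwise edges in order of increasing length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    diagram = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j          # merge the two components
            diagram.append((0.0, length))    # one component dies here
    return diagram


# Hypothetical toy input: two tight pairs of points, far apart.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0), (11.0, 0.0, 0.0)]
diagram = h0_persistence(cloud)
print(diagram)
```

The two short-lived pairs correspond to points merging within each cluster, and the long-lived pair corresponds to the two clusters finally joining. One component never dies and is omitted here. Higher-dimensional features such as loops and voids require a full persistent homology computation, which is the costly step a learned feed-forward predictor aims to avoid.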

Abstract

Recent advances in pretraining 3D point cloud encoders (e.g., Point-BERT, Point-MAE) have produced powerful models whose abilities are typically evaluated on geometric or semantic tasks. At the same time, topological descriptors have been shown to provide informative summaries of a shape's multiscale structure. In this paper, we ask whether topological information can be derived from the features produced by 3D encoders. To address this question, we first introduce DONUT, a synthetic benchmark with controlled topological complexity, and then propose FILTR (Filtration Transformer), a learnable framework to predict persistence diagrams directly from frozen encoders. FILTR adapts a transformer decoder to treat diagram generation as a set prediction task. Our analysis on DONUT reveals that existing encoders retain only limited global topological signals, yet FILTR successfully leverages the information produced by these encoders to approximate persistence diagrams. Our approach enables, for the first time, data-driven extraction of persistence diagrams from raw point clouds through an efficient learnable feed-forward mechanism.
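Treating diagram generation as set prediction means the predicted points have no meaningful order, so a prediction is usually compared with the ground truth through a one-to-one matching. The sketch below shows that idea with a brute-force search over matchings. It is an illustration under stated assumptions, not the loss used in the paper, which this summary does not specify. The function name `matching_cost`, the squared-distance cost, and the example values are all hypothetical. Both sets are assumed to have the same size, whereas practical set predictors typically pad with "no point" slots and use the Hungarian algorithm instead of enumeration.

```python
from itertools import permutations


def matching_cost(predicted, target):
    """Minimum total squared distance over all one-to-one matchings.

    `predicted` and `target` are equal-length lists of (birth, death)
    pairs. Returns (cost, assignment), where assignment[i] is the index
    of the target point matched to predicted[i].
    Brute force: only suitable for very small sets.
    """
    if len(predicted) != len(target):
        raise ValueError("this sketch assumes equally sized sets")

    best_cost, best_assignment = float("inf"), None
    for assignment in permutations(range(len(target))):
        cost = sum(
            (birth - target[k][0]) ** 2 + (death - target[k][1]) ** 2
            for (birth, death), k in zip(predicted, assignment)
        )
        if cost < best_cost:
            best_cost, best_assignment = cost, assignment
    return best_cost, best_assignment


# Hypothetical example: the prediction lists the points in a different order.
predicted = [(0.0, 9.0), (0.0, 1.0)]
target = [(0.0, 1.0), (0.0, 8.0)]
cost, assignment = matching_cost(predicted, target)
print(cost, assignment)
```

Because the cost is minimized over matchings, reordering the predicted points does not change it, which is the property that makes a matching-based objective suitable for unordered outputs such as persistence diagrams.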