Topological Data Analysis-friendly CAD/3D point cloud dataset [P]

Reddit r/MachineLearning / 4/29/2026

Key Points

  • The author is seeking a 3D point cloud (or CAD/mesh) dataset suitable for research comparing Topological Data Analysis (TDA) against standard point-cloud preprocessing methods.
  • They want datasets where classes differ primarily by topological structure (e.g., number of holes/loops/cavities) so TDA yields a meaningful signal for downstream classification.
  • Methods would be evaluated by the classification accuracy of a downstream model after preprocessing, under perturbations such as Gaussian noise, point deletion/subsampling, small deformations, scaling/rotations, and synthetic outliers/corruptions.
  • The author prefers either many samples per class (600+) or enough CAD/mesh models to generate many point-cloud samples, and notes that even binary classification would be sufficient.
  • They ask the community for recommendations of existing datasets plus CAD repositories, synthetic dataset generators, or benchmarks from which such class pairs can be extracted.

Hi everyone,

I’m looking for a suitable 3D point cloud dataset — or a CAD/mesh dataset from which I can sample point clouds — for a small research/report project.

The goal is to compare Topological Data Analysis (TDA) as a preprocessing / feature extraction method against more standard 3D point cloud preprocessing methods, under different perturbations such as:

  • Gaussian jitter / noise
  • random point deletion / subsampling
  • small deformations
  • scaling / rotations
  • outliers or other synthetic corruptions
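For concreteness, most of the perturbations listed above can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `perturb` and all default values (`sigma`, `keep_frac`, `outlier_frac`, `scale_range`) are assumptions chosen for the example, not settings from the post, and small non-rigid deformations are left out.

```python
import numpy as np

def perturb(points, rng, sigma=0.02, keep_frac=0.8,
            outlier_frac=0.05, scale_range=(0.8, 1.2)):
    """Apply jitter, point deletion, rotation, scaling, and outliers
    to an (N, 3) point cloud. All defaults are illustrative."""
    pts = np.asarray(points, dtype=float)

    # Gaussian jitter: add isotropic noise to every coordinate.
    pts = pts + rng.normal(0.0, sigma, size=pts.shape)

    # Random point deletion: keep a random subset without replacement.
    n_keep = max(1, int(round(keep_frac * len(pts))))
    pts = pts[rng.choice(len(pts), size=n_keep, replace=False)]

    # Random rotation: QR-decompose a Gaussian matrix, fix the signs,
    # and force determinant +1 so the result is a proper rotation.
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    pts = pts @ q.T

    # Uniform scaling by a random factor.
    pts = pts * rng.uniform(*scale_range)

    # Synthetic outliers: uniform points inside the bounding box.
    n_out = int(round(outlier_frac * len(pts)))
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    outliers = rng.uniform(lo, hi, size=(n_out, 3))
    return np.vstack([pts, outliers])
```

Passing an explicit `np.random.default_rng(seed)` keeps each corrupted copy reproducible, which helps when the same perturbed clouds have to be fed to every preprocessing method being compared.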

The comparison would be based on the classification accuracy of a downstream model after preprocessing.

I do not necessarily need many classes. Even a binary classification dataset would be enough. What matters most is that the classes should differ in their topological structure, ideally in the number of holes / loops / cavities, so that TDA has a meaningful signal to detect.

For example, something like:

  • sphere / ball-like objects vs torus / ring-like objects
  • solid object vs object with a tunnel
  • objects with different numbers of handles or holes
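If no existing dataset fits, the first pair above can be generated synthetically. The following is a minimal sketch, assuming surface samples are enough; the function names and the torus radii `R` and `r` are hypothetical choices for illustration. A sphere surface has no independent loops and one enclosed cavity, while a torus surface has two independent loops and one cavity, so the classes differ in their degree-1 homology.

```python
import numpy as np

def sample_sphere(n, rng, radius=1.0):
    """Uniform samples on a sphere via normalized Gaussian vectors."""
    v = rng.normal(size=(n, 3))
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    return radius * v

def sample_torus(n, rng, R=1.0, r=0.35):
    """Samples on a torus with major radius R and minor radius r.
    Angles are drawn uniformly, so the density is not exactly
    area-uniform (slightly denser on the inner side)."""
    theta = rng.uniform(0.0, 2.0 * np.pi, n)  # around the central axis
    phi = rng.uniform(0.0, 2.0 * np.pi, n)    # around the tube
    x = (R + r * np.cos(phi)) * np.cos(theta)
    y = (R + r * np.cos(phi)) * np.sin(theta)
    z = r * np.sin(phi)
    return np.stack([x, y, z], axis=1)

def make_dataset(n_per_class, n_points, seed=0):
    """Return X of shape (2 * n_per_class, n_points, 3) and labels y
    (0 = sphere, 1 = torus)."""
    rng = np.random.default_rng(seed)
    spheres = [sample_sphere(n_points, rng) for _ in range(n_per_class)]
    tori = [sample_torus(n_points, rng) for _ in range(n_per_class)]
    X = np.stack(spheres + tori)
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y
```

For example, `make_dataset(600, 1024)` would give 600 clouds per class. Samples drawn this way differ only by sampling noise, so varying the radii per sample may be needed to make the task less trivial.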

Ideally, each class should contain many samples (600+), or the dataset should contain enough CAD/mesh models so that I can sample many point clouds from them.
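Sampling many point clouds from one mesh can be done with area-weighted triangle sampling. Below is a minimal NumPy sketch, assuming the mesh is already loaded as a `vertices` array of shape (V, 3) and an integer `faces` array of shape (F, 3); the function name is hypothetical and mesh loading itself is outside the sketch.

```python
import numpy as np

def sample_mesh_surface(vertices, faces, n, rng):
    """Draw n points uniformly from the surface of a triangle mesh."""
    tri = np.asarray(vertices, dtype=float)[np.asarray(faces)]  # (F, 3, 3)
    a, b, c = tri[:, 0], tri[:, 1], tri[:, 2]

    # Pick triangles with probability proportional to their area.
    areas = 0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1)
    idx = rng.choice(len(areas), size=n, p=areas / areas.sum())

    # Uniform barycentric coordinates: reflect samples that fall
    # outside the triangle back into it.
    u, v = rng.random(n), rng.random(n)
    flip = u + v > 1.0
    u[flip], v[flip] = 1.0 - u[flip], 1.0 - v[flip]

    return (a[idx]
            + u[:, None] * (b[idx] - a[idx])
            + v[:, None] * (c[idx] - a[idx]))
```

Calling this repeatedly with different seeds yields distinct point clouds from the same model, so a few hundred CAD models per class could be enough to reach 600+ samples.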

Does anyone know of a dataset that fits this description? I would also appreciate suggestions for CAD repositories, synthetic dataset generators, or benchmark datasets where such class pairs could be extracted.

Thanks!

submitted by /u/generalbrain_damage