Data Synthesis Improves 3D Myotube Instance Segmentation

arXiv cs.CV / 4/17/2026

Key Points

  • The paper addresses the challenge of performing accurate 3D instance segmentation of myotube structures for quantitative studies, noting that existing pretrained biomedical models do not generalize well due to limited annotated data.
  • It proposes a geometry-driven synthetic data pipeline that models individual myotubes with polynomial centerlines, spatially varying radii, branching structures, and ellipsoidal end caps derived from real microscopy observations.
  • The method renders synthetic images with realistic noise and optical artifacts and applies CycleGAN-based domain adaptation to better match real microscopy appearance.
  • A compact 3D U-Net with a self-supervised pretrained encoder, trained only on synthetic data, achieves a mean IPQ of 0.22 on real data and significantly outperforms three established zero-shot segmentation models.
  • Overall, the results suggest that biophysics-informed, domain-adapted synthesis can enable effective instance segmentation in biomedical settings where annotations are scarce.
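The geometric idea in the second key point can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `synth_myotube`, the quadratic centerline, the sinusoidal radius profile, and all numeric ranges are assumptions chosen for illustration. Branching is omitted, and the end caps are spherical (a special case of ellipsoidal) because the tube is built as a union of spheres along the centerline.

```python
import numpy as np

def synth_myotube(shape=(32, 64, 64), n_samples=200, seed=0):
    """Rasterize one hypothetical tube into a boolean (z, y, x) volume."""
    rng = np.random.default_rng(seed)
    zdim, ydim, xdim = shape
    t = np.linspace(0.0, 1.0, n_samples)  # curve parameter along the tube

    # Polynomial centerline: x advances linearly, y follows a random quadratic,
    # z stays on the middle plane (all coefficient ranges are illustrative).
    x = 4.0 + t * (xdim - 9.0)
    c1, c2 = rng.uniform(-10.0, 10.0, size=2)
    y = ydim / 2.0 + c1 * (t - 0.5) + c2 * (t - 0.5) ** 2
    z = np.full_like(t, zdim / 2.0)

    # Spatially varying radius between 2 and 4 voxels.
    r = 3.0 + np.sin(2.0 * np.pi * t + rng.uniform(0.0, 2.0 * np.pi))

    # Union of spheres centered on the sampled centerline points.
    zz, yy, xx = np.mgrid[0:zdim, 0:ydim, 0:xdim]
    mask = np.zeros(shape, dtype=bool)
    for cz, cy, cx, cr in zip(z, y, x, r):
        mask |= (zz - cz) ** 2 + (yy - cy) ** 2 + (xx - cx) ** 2 <= cr ** 2
    return mask

mask = synth_myotube()
print(mask.shape, int(mask.sum()))
```

Calling the function with different seeds and writing each mask into an integer volume under a distinct label would give a simple instance-labeled training target.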

Abstract

Myotubes are multinucleated muscle fibers serving as key model systems for studying muscle physiology, disease mechanisms, and drug responses. Mechanistic studies and drug screening therefore rely on quantitative morphological readouts such as diameter, length, and branching degree, which in turn require precise three-dimensional instance segmentation. Yet established pretrained biomedical segmentation models fail to generalize to this domain due to the absence of large annotated myotube datasets. We introduce a geometry-driven synthesis pipeline that models individual myotubes via polynomial centerlines, locally varying radii, branching structures, and ellipsoidal end caps derived from real microscopy observations. Synthetic volumes are rendered with realistic noise, optical artifacts, and CycleGAN-based domain adaptation (DA). A compact 3D U-Net with self-supervised encoder pretraining, trained exclusively on synthetic data, achieves a mean IPQ of 0.22 on real data, significantly outperforming three established zero-shot segmentation models. This demonstrates that biophysics-driven synthesis enables effective instance segmentation in annotation-scarce biomedical domains.
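The rendering step can be sketched in the same spirit. The abstract does not specify the noise or optical model, so everything below is an assumption: a separable Gaussian blur stands in for the microscope point spread function, Poisson sampling for shot noise, and additive Gaussian noise for detector read noise. The names `gaussian_kernel` and `render_volume` and all parameter values are hypothetical, and the CycleGAN-based domain adaptation is not covered.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at 3 sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def render_volume(mask, signal=100.0, background=10.0,
                  sigma=1.0, read_noise=2.0, seed=0):
    """Turn a boolean mask into a noisy, blurred intensity volume."""
    rng = np.random.default_rng(seed)
    img = background + signal * mask.astype(np.float64)

    # Separable blur: convolve along each axis in turn. np.convolve pads
    # with zeros, so intensities fall off slightly at the volume borders.
    # Each axis must be at least as long as the kernel.
    k = gaussian_kernel(sigma)
    for axis in range(img.ndim):
        img = np.apply_along_axis(np.convolve, axis, img, k, mode="same")

    # Shot noise (Poisson) followed by additive Gaussian read noise.
    img = rng.poisson(img).astype(np.float64)
    img += rng.normal(0.0, read_noise, size=img.shape)
    return img

# Usage on a toy cube standing in for a synthetic myotube mask.
demo_mask = np.zeros((16, 16, 16), dtype=bool)
demo_mask[4:12, 4:12, 4:12] = True
demo_img = render_volume(demo_mask)
print(demo_img.shape, demo_img[demo_mask].mean() > demo_img[~demo_mask].mean())
```

In a pipeline like the one described, such rendered volumes would then pass through the learned domain adaptation stage before being used for training.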