Efficient Image Annotation via Semi-Supervised Object Segmentation with Label Propagation

arXiv cs.CV / 4/28/2026


Key Points

  • The paper proposes a semi-supervised label propagation method to perform household object segmentation for service robots, reducing reliance on fully labeled training data.
  • A class-agnostic segment proposer generates masks, while an ensemble of Hopfield networks assigns labels by learning representative embeddings across multiple foundation-model embedding spaces (CLIP, ViT, and Theia).
  • The method is reported to scale to 50 object classes with limited annotation effort, addressing the generalization limits of open-vocabulary detectors that work well only on a small set of categories.
  • In a RoboCup@Home context with strict time constraints, the approach is claimed to automatically label about 60% of the dataset.
  • The dataset and code are released publicly, enabling others to reproduce and build on the label propagation pipeline.

Abstract

Reliable object perception is necessary for general-purpose service robots. Open-vocabulary detectors struggle to generalize beyond a few classes, and fully supervised training of object detectors requires time-intensive annotation. We present a semi-supervised label propagation approach for household object segmentation. A segment proposer generates class-agnostic masks, and an ensemble of Hopfield networks assigns labels by learning representative embeddings in complementary foundation-model embedding spaces (CLIP, ViT, Theia). Our approach scales to 50 object classes with limited annotation overhead and can automatically label 60% of the data in a RoboCup@Home setting, where preparation time is severely constrained. Dataset and code are publicly available at https://github.com/ais-bonn/label_propagation.
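The core idea — propose class-agnostic masks, embed each segment in several foundation-model spaces, and propagate a label only when the spaces agree with high confidence — can be sketched as follows. This is a minimal illustration, not the paper's implementation: it stands in for the Hopfield-network ensemble with simple nearest-prototype recall by cosine similarity, and the function names, the unanimity rule, and the `min_sim` threshold are assumptions for the sketch.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Normalize rows to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def propagate_labels(prototypes, queries, min_sim=0.5):
    """Ensemble label propagation over multiple embedding spaces (sketch).

    prototypes: list of (C, D_s) arrays, one per embedding space, holding a
        representative embedding per class (stands in for the learned
        Hopfield memories over e.g. CLIP / ViT / Theia features).
    queries: list of (N, D_s) arrays of segment embeddings in the same spaces.

    Each space votes for the most similar class prototype; a segment is
    auto-labeled only when all spaces agree and the mean similarity clears
    min_sim. Rejected segments get -1 and are left for manual annotation.
    """
    votes, sims = [], []
    for P, Q in zip(prototypes, queries):
        S = l2_normalize(Q) @ l2_normalize(P).T  # (N, C) cosine similarities
        votes.append(S.argmax(axis=1))           # per-space class vote
        sims.append(S.max(axis=1))               # per-space confidence
    votes = np.stack(votes)                      # (num_spaces, N)
    sims = np.stack(sims)
    agree = (votes == votes[0]).all(axis=0)      # unanimous across spaces
    confident = sims.mean(axis=0) >= min_sim
    return np.where(agree & confident, votes[0], -1)

# Toy usage: two classes, two 2-D embedding spaces.
protos_a = np.array([[1.0, 0.0], [0.0, 1.0]])
protos_b = np.array([[1.0, 0.0], [0.0, 1.0]])
queries_a = np.array([[0.9, 0.1], [0.1, 0.9]])
queries_b = np.array([[0.9, 0.1], [0.9, 0.1]])  # spaces disagree on segment 1

labels = propagate_labels([protos_a, protos_b], [queries_a, queries_b])
print(labels)  # → [ 0 -1]: segment 0 auto-labeled, segment 1 deferred
```

In this framing, the roughly 60% auto-labeling rate reported in the paper corresponds to the fraction of segments that pass the agreement-and-confidence gate; everything else falls back to human annotation, which is what keeps the annotation overhead low.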