Energy-Based Open-Set Active Learning for Object Classification

arXiv cs.LG / April 23, 2026


Key Points

  • The paper addresses a key limitation of traditional active learning by moving from closed-set to open-set settings, where unlabeled data may include both known and unknown object classes.
  • It introduces a dual-stage, energy-based active learning framework that first filters likely known samples using an energy-based known/unknown separator and then ranks the filtered samples by informativeness with a second energy-based scorer.
  • The method leverages an energy landscape to assign lower energy to known-class samples and higher energy to unknown-class samples, reducing waste of annotation budgets on irrelevant categories.
  • Experiments across 2D and 3D object classification benchmarks (CIFAR-10/100, TinyImageNet, and ModelNet40) show improved annotation efficiency and classification performance compared with existing open-set approaches.
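The dual-stage selection described above can be sketched in a few lines. The paper's exact energy formulations are not given here, so this sketch makes two assumptions: Stage 1 uses the standard free-energy score derived from classifier logits (lower energy for likely known samples), and Stage 2 uses predictive entropy as a stand-in informativeness scorer. Function names, the threshold, and the temperature `T` are illustrative, not from the paper.

```python
import numpy as np

def free_energy(logits, T=1.0):
    """Free energy E(x) = -T * logsumexp(logits / T).
    Lower energy tends to indicate known (in-distribution) samples."""
    z = logits / T
    m = z.max(axis=1, keepdims=True)
    return -T * (m.squeeze(1) + np.log(np.exp(z - m).sum(axis=1)))

def dual_stage_select(logits, budget, energy_threshold):
    # Stage 1: energy-based known/unknown separation -- keep only
    # samples whose energy falls below the threshold (likely known).
    energy = free_energy(logits)
    known_idx = np.where(energy < energy_threshold)[0]

    # Stage 2: rank the filtered samples by informativeness; here
    # predictive entropy is used as a placeholder scoring rule.
    z = logits[known_idx]
    probs = np.exp(z - z.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)

    # Query the most uncertain known samples, up to the budget.
    order = known_idx[np.argsort(-entropy)]
    return order[:budget]
```

For example, a confidently classified sample (one dominant logit) gets low energy and low entropy, so it passes Stage 1 but ranks low in Stage 2; an ambiguous known sample (two near-equal logits) passes Stage 1 with high entropy and is queried first; a sample with uniformly small logits gets high energy and is filtered out in Stage 1.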

Abstract

Active learning (AL) has emerged as a crucial methodology for minimizing labeling costs in deep learning by selecting the most valuable samples from a pool of unlabeled data for annotation. Traditional AL operates under a closed-set assumption, where all classes in the dataset are known and consistent. However, real-world scenarios often present open-set conditions in which unlabeled data contains both known and unknown classes. In such environments, standard AL techniques struggle: they can mistakenly query samples from unknown categories, leading to inefficient use of annotation budgets. In this paper, we propose a novel dual-stage energy-based framework for open-set AL. Our method employs two specialized energy-based models (EBMs). The first, an energy-based known/unknown separator, filters out samples likely to belong to unknown classes. The second, an energy-based sample scorer, assesses the informativeness of the filtered known samples. Using the energy landscape, our models distinguish between data points from known and unknown classes in the unlabeled pool by assigning lower energy to known samples and higher energy to unknown samples, ensuring that only samples from classes of interest are selected for labeling. By integrating these components, our approach ensures efficient and targeted sample selection, maximizing learning impact in each iteration. Experiments on 2D (CIFAR-10, CIFAR-100, TinyImageNet) and 3D (ModelNet40) object classification benchmarks demonstrate that our framework outperforms existing approaches, achieving superior annotation efficiency and classification performance in open-set environments.
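The abstract does not spell out how the energy landscape is defined. A common choice in energy-based out-of-distribution work, which the separator stage may resemble, is the free energy derived from a classifier's logits $f_k(x)$ over $K$ known classes with temperature $T$:

$$
E(x; f) = -T \log \sum_{k=1}^{K} e^{f_k(x)/T}.
$$

Under this formulation, samples that the model maps confidently onto any known class receive large logits and hence low energy, while samples from unseen categories tend to receive uniformly small logits and hence high energy, matching the separation behavior the abstract describes. Whether the paper uses exactly this score or a learned variant is an assumption here.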