Labeled TrustSet Guided: Batch Active Learning with Reinforcement Learning

arXiv cs.LG · April 15, 2026


Key Points

  • The paper addresses limitations of traditional batch active learning by proposing TrustSet, which selects informative samples from labeled data while enforcing balanced class distribution to reduce long-tail effects.
  • TrustSet improves on approaches like CoreSet by using labeled feedback and model-oriented criteria (such as pruning redundant samples) rather than relying mainly on unlabeled-data distribution metrics such as Mahalanobis distance.
  • To extend TrustSet’s labeled-data gains to the unlabeled pool, the authors introduce an RL-based sampling policy that approximates choosing high-quality TrustSet candidates from unlabeled data.
  • The combined method, BRAL-T (Batch Reinforcement Active Learning with TrustSet), is reported to reach state-of-the-art performance across 10 image classification benchmarks and 2 active fine-tuning tasks.
  • Overall, the work aims to reduce labeling costs and improve data efficiency for training large-scale deep learning models by leveraging both labeled information and reinforcement learning-driven selection.
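The paper itself does not include code, but the core TrustSet idea in the key points above — rank labeled samples by a model-oriented informativeness criterion, then take a fixed quota per class to keep the selection balanced — can be sketched in a few lines. Everything here is illustrative: the function name, the use of per-sample loss as the informativeness score, and the tuple layout are assumptions, not the paper's actual design.

```python
from collections import defaultdict

def select_trustset(samples, k_per_class):
    """Pick the k most informative labeled samples per class (illustrative sketch).

    `samples` is a list of (sample_id, label, loss) tuples, where the
    per-sample loss stands in for whatever model-oriented informativeness
    score the method actually uses.
    """
    by_class = defaultdict(list)
    for sid, label, loss in samples:
        by_class[label].append((loss, sid))
    selected = []
    for scored in by_class.values():
        # A fixed per-class quota enforces the balanced class distribution
        # that the authors use to mitigate the long-tail problem; within a
        # class, higher-loss samples are kept and redundant easy ones pruned.
        scored.sort(reverse=True)
        selected.extend(sid for _, sid in scored[:k_per_class])
    return selected
```

Note how this differs from a CoreSet-style selection: the ranking signal (loss) comes from the model's behavior on labeled data, not from geometric coverage of the unlabeled distribution.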

Abstract

Batch active learning (BAL) is a crucial technique for reducing labeling costs and improving data efficiency in training large-scale deep learning models. Traditional BAL methods often rely on metrics like Mahalanobis Distance to balance uncertainty and diversity when selecting data for annotation. However, these methods predominantly focus on the distribution of unlabeled data and fail to leverage feedback from labeled data or the model's performance. To address these limitations, we introduce TrustSet, a novel approach that selects the most informative data from the labeled dataset, ensuring a balanced class distribution to mitigate the long-tail problem. Unlike CoreSet, which focuses on maintaining the overall data distribution, TrustSet optimizes the model's performance by pruning redundant data and using label information to refine the selection process. To extend the benefits of TrustSet to the unlabeled pool, we propose a reinforcement learning (RL)-based sampling policy that approximates the selection of high-quality TrustSet candidates from the unlabeled data. Combining TrustSet and RL, we introduce the Batch Reinforcement Active Learning with TrustSet (BRAL-T) framework. BRAL-T achieves state-of-the-art results across 10 image classification benchmarks and 2 active fine-tuning tasks, demonstrating its effectiveness and efficiency in various domains.
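The abstract's second ingredient — an RL policy that learns to pick unlabeled samples likely to become good TrustSet members — can likewise be sketched as a toy policy-gradient loop. The paper does not specify the policy architecture or reward, so the sketch below assumes the simplest possible setup: a linear softmax policy over per-sample feature vectors (e.g. uncertainty scores), trained with REINFORCE, where the reward would in practice come from how well the chosen samples serve the TrustSet objective. All names and the feature/reward design are hypothetical.

```python
import math
import random

class SamplingPolicy:
    """Toy linear softmax policy over an unlabeled pool, trained with REINFORCE."""

    def __init__(self, n_features, lr=0.5, seed=0):
        self.w = [0.0] * n_features  # one weight per sample feature
        self.lr = lr
        self.rng = random.Random(seed)

    def probs(self, pool):
        # Softmax over linear scores w · f for each sample's feature vector f.
        scores = [sum(wi * fi for wi, fi in zip(self.w, f)) for f in pool]
        m = max(scores)
        exp = [math.exp(s - m) for s in scores]
        z = sum(exp)
        return [e / z for e in exp]

    def sample(self, pool):
        # Draw one pool index from the policy distribution.
        r, acc = self.rng.random(), 0.0
        for i, p in enumerate(self.probs(pool)):
            acc += p
            if r <= acc:
                return i
        return len(pool) - 1

    def update(self, pool, chosen, reward):
        # REINFORCE step for a softmax policy:
        # grad log pi(a) = f_a - E_pi[f], scaled by the observed reward.
        probs = self.probs(pool)
        for i in range(len(self.w)):
            expected = sum(p * f[i] for p, f in zip(probs, pool))
            self.w[i] += self.lr * reward * (pool[chosen][i] - expected)
```

In an actual BAL loop, `reward` would be derived from model improvement after annotating the chosen batch, which is what lets the policy approximate TrustSet-quality selection without access to the unlabeled samples' labels.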