RS-OVC: Open-Vocabulary Counting for Remote-Sensing Data

arXiv cs.CV / 4/13/2026


Key Points

  • The paper highlights a key limitation of remote-sensing object counting: most methods handle only a closed set of object classes seen during training, so counting a new class requires re-annotation and retraining.
  • It proposes RS-OVC, which the authors describe as the first open-vocabulary counting model tailored to remote-sensing and aerial imagery; it counts novel object categories that did not appear in its training data.
  • RS-OVC takes textual and/or visual conditioning as the guidance signal that specifies which objects to count.
  • The authors report that the model accurately counts classes unseen during training, which is intended to make RS monitoring more adaptable to dynamic, real-world scenarios.
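
To make the idea of conditioning concrete, here is a toy sketch of prompt-conditioned counting. It is not the RS-OVC architecture, which this summary does not describe. It assumes that image patches and prompts (a text embedding or a visual exemplar) already live in a shared feature space, and it counts patches whose cosine similarity to the prompt exceeds a threshold. The function name, the 2-D features, and the 0.5 threshold are all hypothetical.

```python
import numpy as np

def count_by_similarity(patch_features, prompt_embedding, threshold=0.5):
    """Toy counter (illustrative only, not the paper's method).

    patch_features: array of shape (H, W, D), one feature vector per patch.
    prompt_embedding: array of shape (D,), a text or exemplar embedding.
    Returns the number of patches whose cosine similarity to the prompt
    is above `threshold`.
    """
    patches = patch_features / np.linalg.norm(patch_features, axis=-1, keepdims=True)
    prompt = prompt_embedding / np.linalg.norm(prompt_embedding)
    similarity = patches @ prompt  # shape (H, W)
    return int((similarity > threshold).sum())

# Hypothetical 2x3 grid of patch features in a made-up 2-D embedding space.
grid = np.array([
    [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1]],
    [[0.0, 1.0], [0.9, 0.1], [0.1, 1.0]],
])

# "Textual" conditioning: stand-in embeddings for two class names.
ship_prompt = np.array([1.0, 0.0])
tank_prompt = np.array([0.0, 1.0])
ship_count = count_by_similarity(grid, ship_prompt)
tank_count = count_by_similarity(grid, tank_prompt)

# "Visual" conditioning: use one patch of the image itself as the exemplar.
exemplar_count = count_by_similarity(grid, grid[0, 0])
```

The point of the sketch is that the class to count is an input rather than a fixed output head, so a new class needs only a new prompt and no retraining.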

Abstract

Object counting for remote-sensing (RS) imagery is attracting increasing research interest due to its crucial role in a wide and diverse set of applications. While several promising methods for RS object counting have been proposed, existing methods focus on a closed, pre-defined set of object classes. This limitation necessitates costly re-annotation and model re-training to adapt current approaches to count novel objects that have not been seen during training, and severely inhibits their application in dynamic, real-world monitoring scenarios. To address this gap, in this work we propose RS-OVC, the first Open Vocabulary Counting (OVC) model for remote-sensing and aerial imagery. We show that our model can accurately count novel object classes that were unseen during training, based solely on textual and/or visual conditioning.