Crowdsourcing of Real-world Image Annotation via Visual Properties
arXiv cs.CV / 4/17/2026
Key Points
- The paper addresses the “semantic gap” in object recognition datasets, arguing that the mapping from visual data to language is complex and can bias the performance of computer vision models.
- It proposes an image annotation approach that combines knowledge representation, natural language processing, and computer vision, using visual property constraints to reduce annotator subjectivity.
- An interactive crowdsourcing framework is introduced that asks dynamically generated questions guided by a predefined object category hierarchy and real-time annotator feedback.
- Experiments indicate that the proposed methodology is effective, and the authors analyze annotator feedback to further optimize the crowdsourcing setup.
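
To make the hierarchy-guided questioning concrete, here is a minimal sketch of how such a loop might work. It is not the authors' implementation: the `Category` class, the `annotate` function, the example hierarchy, and the question template are all hypothetical, and the annotator is modeled as a simple yes/no callback.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Category:
    """One node of a predefined object category hierarchy (hypothetical)."""
    name: str
    # Assumed: a visual property that separates this node from its siblings.
    visual_property: str = ""
    children: list[Category] = field(default_factory=list)


def annotate(root: Category, answer: Callable[[str], bool]) -> tuple[str, list[str]]:
    """Descend the hierarchy, generating one yes/no question per candidate child.

    `answer` stands in for the annotator's real-time feedback.
    Returns the most specific label reached and the questions asked.
    """
    node = root
    asked: list[str] = []
    while node.children:
        chosen = None
        for child in node.children:
            question = f"Does the object have {child.visual_property}?"
            asked.append(question)
            if answer(question):
                chosen = child
                break
        if chosen is None:
            break  # every child was rejected, so stay at the current level
        node = chosen
    return node.name, asked


# Illustrative hierarchy; the categories and properties are made up.
HIERARCHY = Category("instrument", children=[
    Category("string instrument", "strings", [
        Category("guitar", "a fretted neck"),
        Category("violin", "a chin rest"),
    ]),
    Category("wind instrument", "a mouthpiece"),
])
```

Because each question is phrased in terms of a visible property rather than a category name, the annotator never has to choose a label directly; the label follows from the path of answers. When every child is rejected, the sketch falls back to the parent category instead of forcing a guess.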

