What image/video training data is hardest to find right now? [R]

Reddit r/MachineLearning / 4/10/2026

💬 Opinion | Signals & Early Trends | Ideas & Deep Analysis | Tools & Practical Usage

Key Points

A builder of a crowdsourced photo/video collection platform asks the community what types of image data they wish existed but are currently hard to obtain for training computer vision models.
The platform’s pipeline is described as smartphone photo collection with automatic labeling using YOLO/CLIP and enrichment via 40+ metadata fields such as weather, time, GPS, and OCR.
Proposed high-demand dataset concepts include European street scenes (e.g., Switzerland/France), supermarket shelves with OCR-extracted prices, analog utility meters, restaurant menus with prices, and EV charging stations categorized by type.
The post frames the question as an input-gathering step to decide what data to collect first, emphasizing practical utility for real model-building use cases.

I'm building a crowdsourced photo collection platform (contributors take photos with smartphones, we auto-label with YOLO/CLIP + enrich with 40+ metadata fields per image including weather, time, GPS, OCR).
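To make the pipeline description concrete, here is a rough sketch of what one enriched image record could look like. The post does not publish a schema, so every class name, field name, and sample value below is hypothetical and made up for illustration; only the general idea (auto-labels plus weather, time, GPS, and OCR metadata) comes from the post.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Detection:
    """One auto-generated label (hypothetical structure)."""
    label: str
    confidence: float  # model score in [0, 1]
    source: str        # which model produced it, e.g. "yolo" or "clip"


@dataclass
class ImageRecord:
    """Hypothetical per-image record; the real platform's fields are not known."""
    image_id: str
    captured_at: datetime
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    weather: Optional[str] = None
    ocr_text: list[str] = field(default_factory=list)
    detections: list[Detection] = field(default_factory=list)

    def confident_labels(self, threshold: float = 0.5) -> list[str]:
        """Return the sorted, deduplicated labels scoring at or above the threshold."""
        return sorted({d.label for d in self.detections if d.confidence >= threshold})

    def to_json(self) -> str:
        """Serialize the record, writing the timestamp as an ISO 8601 string."""
        data = asdict(self)
        data["captured_at"] = self.captured_at.isoformat()
        return json.dumps(data, ensure_ascii=False)


# Made-up sample values, purely for illustration.
record = ImageRecord(
    image_id="img-0001",
    captured_at=datetime(2026, 4, 10, 9, 30, tzinfo=timezone.utc),
    latitude=46.2044,
    longitude=6.1432,
    weather="overcast",
    ocr_text=["CHF 2.95"],
    detections=[
        Detection("shelf", 0.91, "yolo"),
        Detection("price tag", 0.42, "clip"),
    ],
)

print(record.confident_labels())  # only labels at or above the default 0.5 threshold
print(record.to_json())
```

Keeping the model name and confidence next to each label would let dataset users filter out low-confidence auto-labels themselves rather than relying on a single cutoff chosen at collection time.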

Before I decide what to collect first, I want to know: what image data do YOU wish existed but doesn't?

Some ideas I'm considering:

- European street scenes (no dataset covers Switzerland/France)

- Supermarket shelves with OCR-extracted prices

- Analog utility meters

- Restaurant menus with prices

- EV charging stations by type

What would YOU actually use?
