Detecting mirrored selfie images: OCR the best way? [D]

Reddit r/MachineLearning / 4/10/2026

💬 Opinion · Ideas & Deep Analysis · Tools & Practical Usage

Key Points

  • The post discusses detecting mirrored (backwards) selfie images so they can be corrected before sending them to vision-language and face-embedding pipelines.
  • The author argues that OCR-based detection (e.g., running EasyOCR on text crops and comparing normal vs flipped read confidence) could help decide whether an image is mirrored.
  • The core question is whether the “OCR score trick” is truly optimal or if there is a more accurate and lightweight alternative (e.g., a smarter small model) for mirror detection.
  • The motivation highlights that some models (like Qwen/Florence) may already be robust to flipped/augmented training data, making prompt-based approaches less effective.
  • The thread seeks practical guidance on building a low-compute mirror-detection component that improves downstream recognition quality.

I'm trying to catch mirrored ("backwards") selfie images before passing them to our VLM text reader and/or face-embedding extraction. Since models like Qwen and Florence are presumably trained with lots of flip-augmented data, they are mostly blind to backwards text, and prompting them just seems to be fighting their base training. My best idea right now is to run EasyOCR on the text crops and check whether the normal or the flipped version gets a higher read score. Is this OCR score trick really the best way to handle this, or is there a smarter, small-model approach I'm missing?
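A minimal sketch of the OCR score trick described above: score the image and its horizontal flip with any OCR callable that returns `(text, confidence)` pairs, and flag the image as mirrored when the flip reads noticeably better. The `read_fn` wrapper, the length-weighted scoring, and the `margin` threshold are my assumptions, not something from the post; the EasyOCR wiring at the bottom is also an untested assumption about how you'd plug it in.

```python
from typing import Callable, List, Tuple

import numpy as np

# Any OCR backend that maps an image to (text, confidence) pairs.
OcrFn = Callable[[np.ndarray], List[Tuple[str, float]]]


def ocr_score(read_fn: OcrFn, image: np.ndarray) -> float:
    """Sum of per-detection confidences, weighted by text length so
    longer, confidently read strings dominate the score (assumption)."""
    return sum(conf * len(text) for text, conf in read_fn(image))


def is_mirrored(read_fn: OcrFn, image: np.ndarray, margin: float = 0.1) -> bool:
    """True if the horizontally flipped image OCRs noticeably better.

    A relative margin keeps near-ties (no text, symmetric glyphs)
    defaulting to "not mirrored" rather than flip-flopping on noise.
    """
    flipped = image[:, ::-1]  # mirror along the vertical axis
    return ocr_score(read_fn, flipped) > ocr_score(read_fn, image) * (1.0 + margin)


# With EasyOCR (assumption: `pip install easyocr`), a wrapper might look like:
#   reader = easyocr.Reader(['en'], gpu=False)
#   read_fn = lambda img: [(text, conf) for _, text, conf in reader.readtext(img)]
#   if is_mirrored(read_fn, img):
#       img = img[:, ::-1]  # un-mirror before the VLM / embedding pipeline
```

Running OCR twice per image is the obvious cost; restricting it to detected text crops (as the post suggests) keeps the extra compute small.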

submitted by /u/dangerousdotnet