Image Score: Learning and Evaluating Human Preferences for Mercari Search
arXiv cs.CV / 5/4/2026
Key Points
- Mercari tackles image quality assessment for its C2C search experience, where implicit user feedback is difficult to align with true human preferences about image quality.
- The company proposes a cost-efficient weak-supervision approach that uses an LLM with chain-of-thought prompting to generate image aesthetics labels that better correlate with e-commerce user behavior.
- Using LLM-produced labels improves the explainability of deep image quality evaluation, which supports customer journey optimization on Mercari.
- Experiments show that the LLM-derived labels correlate with user behavior, and an online A/B-style experiment yielded significant sales growth on Mercari’s web platform.
- The approach is positioned as convenient for proof-of-concept testing because it reduces reliance on expensive explicit human judgments.
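
The claim that LLM-derived labels correlate with user behavior can be illustrated with a rank-correlation check. The following is a minimal sketch, not the paper's evaluation code: the function names, the choice of Spearman correlation, and the sample numbers are all assumptions made for illustration, and the data is invented.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical data: LLM aesthetics labels (1-5) and per-item click-through rates.
    llm_labels = [1, 2, 2, 4, 5]
    click_through = [0.010, 0.020, 0.015, 0.040, 0.050]
    print(f"Spearman correlation: {spearman(llm_labels, click_through):.3f}")
```

A value near 1 would suggest that items the LLM rates higher also tend to attract more engagement; a value near 0 would suggest the labels carry little behavioral signal. Rank correlation is used here because label scales and behavioral metrics need not be linearly related.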



