When Choices Become Priors: Contrastive Decoding for Scientific Figure Multiple-Choice QA
arXiv cs.AI / 3/31/2026
💬 Opinion · Ideas & Deep Analysis · Models & Research
Key Points
- Scientific figure multiple-choice QA can fail because the answer-choice text functions as a prior, biasing multimodal models toward scientifically plausible options even when the figure indicates otherwise.
- The paper introduces SCICON, a training-free contrastive decoding method that scores each candidate by subtracting its text-only score from its image-conditioned score to discount choice-induced priors.
- SCICON differs from earlier contrastive decoding methods by focusing specifically on priors embedded in the candidate text rather than contrasting inputs or perturbing instructions.
- Experiments across three scientific figure QA benchmarks and three model backbones show consistent accuracy improvements versus standard decoding baselines.
- The findings suggest that explicitly decoding against choice-induced priors is a straightforward way to improve figure-grounded reasoning in scientific MCQA.
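The scoring rule described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names (`contrastive_scores`, `pick_answer`), the `alpha` weighting parameter, and the toy log-probabilities are all assumptions for demonstration; in practice the two log-probabilities would come from the same multimodal model run with and without the figure.

```python
import math

def contrastive_scores(image_logprobs, text_only_logprobs, alpha=1.0):
    """Score each choice c as log p(c | figure, question)
    minus alpha * log p(c | question), discounting the
    prior induced by the answer-choice text alone.
    (Hypothetical sketch of the SCICON-style rule.)"""
    return {c: image_logprobs[c] - alpha * text_only_logprobs[c]
            for c in image_logprobs}

def pick_answer(image_logprobs, text_only_logprobs, alpha=1.0):
    scores = contrastive_scores(image_logprobs, text_only_logprobs, alpha)
    return max(scores, key=scores.get)

# Toy numbers: "A" is the scientifically plausible option the
# text-only prior favors; "B" is what the figure actually shows.
image_cond = {"A": math.log(0.45), "B": math.log(0.40), "C": math.log(0.15)}
text_only  = {"A": math.log(0.70), "B": math.log(0.10), "C": math.log(0.20)}

# Standard decoding keeps the highest image-conditioned score: "A".
print(max(image_cond, key=image_cond.get))   # A
# Contrastive decoding subtracts the text-only prior and flips to "B".
print(pick_answer(image_cond, text_only))    # B
```

The design point is that the subtraction only changes the ranking when a choice's score is inflated by its text alone; choices whose image-conditioned and text-only scores rise together are left relatively unaffected.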