PC-MNet: Dual-Level Congruity Modeling for Multimodal Sarcasm Detection via Polarity-Modulated Attention
arXiv cs.CL / 5/5/2026
Key Points
- PC-MNet proposes a new multimodal sarcasm detection model that targets pragmatic incongruities between literal text and nonverbal cues.
- Instead of using similarity-based attention and uniform late fusion, it introduces a scalar congruity routing mechanism and a prior-guided contextual graph to better handle functional entanglement.
- The model uses a two-stage asymmetric optimization with inconsistency-aware contrastive learning to form a generalized incongruity manifold and to fuse only the most discriminative evidence across multiple granularities.
- Experiments on the MUStARD benchmark and on balanced datasets built to mitigate spurious correlations show new state-of-the-art results, improving Macro-F1 by 3.14% over the strongest prior multimodal baseline.
- The approach aims to architecturally isolate conflicts at atomic, compositional, and contextual levels to more robustly capture subtle pragmatic mismatches in human communication.
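
The contrast between similarity-based attention and attention steered by a congruity score can be illustrated with a small sketch. This is not the PC-MNet implementation: the summary does not specify how the scalar congruity routing is computed, so the function name `polarity_modulated_attention`, the `polarity` and `temperature` parameters, and the use of cosine similarity as the congruity score are all assumptions made for illustration. With `polarity=1.0` the weights favor nonverbal cues that agree with the text vector (ordinary similarity attention); with `polarity=-1.0` they favor cues that conflict with it.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def polarity_modulated_attention(query, keys, polarity=-1.0, temperature=1.0):
    """Hypothetical sketch, not the paper's method.

    query: text-side vector.
    keys: list of nonverbal-cue vectors (for example audio or visual features).
    polarity: +1.0 attends to congruent cues, -1.0 to incongruent cues (assumed).
    Returns (attention weights, per-key congruity scores in [-1, 1]).
    """
    congruity = [cosine(query, key) for key in keys]
    logits = [polarity * score / temperature for score in congruity]
    return softmax(logits), congruity


# Toy example: one text vector and three nonverbal cue vectors.
text_vec = [1.0, 0.0]
cues = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]  # aligned, opposed, orthogonal
weights, scores = polarity_modulated_attention(text_vec, cues)
print(weights, scores)
```

In this toy setup the opposed cue receives the largest weight when `polarity` is negative, which is the behavior one would want if the goal is to surface mismatches rather than agreements.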