A Proposal-Free Query-Guided Network for Grounded Multimodal Named Entity Recognition
arXiv cs.CV / 3/19/2026
Key Points
- The paper proposes a proposal-free Query-Guided Network (QGN) for Grounded Multimodal Named Entity Recognition (GMNER), unifying multimodal reasoning and decoding through text guidance and cross-modal interaction.
- It critiques two-step GMNER approaches, which first obtain candidate regions from pre-trained detectors and then align entities to them; such pipelines can miss the fine-grained regions required for accurate grounding.
- QGN eliminates external proposals and, according to the paper, achieves robust open-domain grounding with top performance on standard GMNER benchmarks.
- The authors report extensive experiments supporting QGN's effectiveness and its potential to improve grounding accuracy in real-world GMNER applications.