VirPro: Visual-referred Probabilistic Prompt Learning for Weakly-Supervised Monocular 3D Detection
arXiv cs.CV / 3/19/2026
Key Points
- VirPro introduces Visual-referred Probabilistic Prompt Learning, a multi-modal pretraining paradigm for weakly supervised monocular 3D detection that combines visual cues with learnable textual prompts.
- The method uses an Adaptive Prompt Bank (APB) to store instance-conditioned prompts and Multi-Gaussian Prompt Modeling (MGPM) to inject scene-based visual features into textual embeddings, capturing visual uncertainty.
- An RoI-level contrastive matching strategy aligns vision and language embeddings and strengthens semantic coherence among objects that co-occur in the same scene.
- Experiments on the KITTI benchmark show consistent performance gains, with improvements of up to 4.8% in average precision over baselines.
- The work proposes a new direction for weakly supervised 3D detection by leveraging probabilistic, scene-aware prompts to better model visual diversity in real-world scenes.
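
To make the two central ideas above more concrete, here is a minimal sketch, not the authors' implementation. It assumes each textual prompt is modeled as a single diagonal Gaussian (the paper's MGPM uses multiple Gaussians conditioned on scene features, which is not reproduced here) and that RoI i is paired with prompt i under a one-directional InfoNCE-style loss. All function names, the temperature value, and the toy vectors are hypothetical choices for illustration.

```python
import math
import random


def sample_prompt(mean, log_var, rng):
    """Draw one prompt embedding from a diagonal Gaussian (reparameterization).

    Assumption: one Gaussian per prompt; the summary does not specify the
    exact parameterization used by MGPM.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cosine_logits(roi_feats, text_feats, temperature=0.07):
    """Temperature-scaled cosine similarity between every RoI and every prompt."""
    rois = [l2_normalize(r) for r in roi_feats]
    texts = [l2_normalize(t) for t in text_feats]
    return [[sum(a * b for a, b in zip(r, t)) / temperature for t in texts]
            for r in rois]


def info_nce(logits):
    """Mean cross-entropy where RoI i is treated as the positive for prompt i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_z - row[i]
    return total / len(logits)


# Toy data: two RoI features and two prompt distributions (hypothetical values).
rng = random.Random(0)
roi_feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
prompt_means = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]]
prompt_log_vars = [[-4.0, -4.0, -4.0], [-4.0, -4.0, -4.0]]

prompts = [sample_prompt(m, lv, rng)
           for m, lv in zip(prompt_means, prompt_log_vars)]
loss = info_nce(cosine_logits(roi_feats, prompts))
print(f"contrastive loss on sampled prompts: {loss:.4f}")
```

Sampling from a distribution rather than using a fixed prompt vector is one way to let the text side express uncertainty about an object's appearance. In this sketch the loss falls when each RoI sits closest to its own prompt and rises when the pairing is swapped.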