VirPro: Visual-referred Probabilistic Prompt Learning for Weakly-Supervised Monocular 3D Detection
arXiv cs.CV / 3/19/2026
Key Points
- VirPro introduces Visual-referred Probabilistic Prompt Learning, a multi-modal pretraining paradigm for weakly supervised monocular 3D detection that combines visual cues with learnable textual prompts.
- The method uses an Adaptive Prompt Bank (APB) to store instance-conditioned prompts and Multi-Gaussian Prompt Modeling (MGPM) to inject scene-based visual features into textual embeddings, capturing visual uncertainty.
- An RoI-level contrastive matching strategy aligns vision and language embeddings and strengthens semantic coherence among objects that co-occur in the same scene.
- Experiments on the KITTI benchmark show consistent gains over baselines, with improvements of up to 4.8% in average precision.
- The work proposes a new direction for weakly supervised 3D detection by leveraging probabilistic, scene-aware prompts to better model visual diversity in real-world scenes.
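
To make the ideas of probabilistic prompts and RoI-level contrastive matching more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes a single diagonal Gaussian per prompt (rather than the multi-Gaussian modeling described above), a symmetric InfoNCE-style loss, and a temperature of 0.07. The function names `sample_prompt` and `roi_text_contrastive_loss` are hypothetical and chosen only for illustration.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def sample_prompt(mean, log_var, rng):
    """Draw one prompt embedding from a diagonal Gaussian.

    Uses the reparameterization form: mean + sigma * eps, eps ~ N(0, 1).
    Assumption: one Gaussian per prompt, a simplification for this sketch.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def _mean_cross_entropy(logits):
    """Mean cross-entropy where row i's correct class is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for a stable log-sum-exp
        lse = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += lse - row[i]
    return total / len(logits)


def roi_text_contrastive_loss(roi_feats, text_feats, temperature=0.07):
    """Symmetric InfoNCE-style loss between paired RoI and text embeddings.

    roi_feats[i] and text_feats[i] are treated as a positive pair; every
    other combination in the batch is treated as a negative.
    """
    rois = [l2_normalize(v) for v in roi_feats]
    texts = [l2_normalize(v) for v in text_feats]
    logits = [[sum(a * b for a, b in zip(r, t)) / temperature for t in texts]
              for r in rois]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (_mean_cross_entropy(logits) + _mean_cross_entropy(transposed))


if __name__ == "__main__":
    rng = random.Random(0)
    roi_feats = [[1.0, 0.0], [0.0, 1.0]]  # toy RoI embeddings
    prompts = [sample_prompt(m, [-4.0, -4.0], rng) for m in roi_feats]
    print(roi_text_contrastive_loss(roi_feats, prompts))
```

In this toy setup the loss is small when each sampled prompt stays close to its paired RoI embedding and grows when the pairs are mismatched, which is the behavior a contrastive alignment objective is meant to encourage. A larger `log_var` spreads the sampled prompts further from the mean, which is one simple way to represent uncertainty about an object's appearance.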