Causal Bootstrapped Alignment for Unsupervised Video-Based Visible-Infrared Person Re-Identification
arXiv cs.CV / 4/20/2026
Key Points
- The paper targets unsupervised video-based visible–infrared person re-identification (VVI-ReID) from unlabeled video tracklets for all-day surveillance, aiming to avoid the costly cross-modality annotations required by supervised methods.
- It finds that naively extending image-based unsupervised VI-ReID approaches to video using generic pretrained encoders performs poorly due to weak identity discrimination and strong modality bias.
- To fix these problems, the proposed Causal Bootstrapped Alignment (CBA) framework uses Causal Intervention Warm-up (CIW) to suppress spurious correlations from modality and motion while preserving identity-relevant semantics.
- It further introduces Prototype-Guided Uncertainty Refinement (PGUR), a coarse-to-fine cross-modality alignment method that handles visible–infrared granularity mismatch using uncertainty-aware supervision guided by reliable visible prototypes.
- Experiments on HITSZ-VCM and BUPTCampus show that CBA substantially outperforms existing unsupervised image-based VI-ReID methods when those methods are adapted to the video (VVI-ReID) setting.
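The prototype-guided refinement idea in PGUR can be illustrated with a minimal sketch: build per-identity prototypes from visible features under pseudo-labels, then assign infrared features to prototypes with a confidence weight. Note this is an illustrative approximation, not the paper's actual algorithm; the margin-based uncertainty score and all function names here are assumptions.

```python
import numpy as np

def visible_prototypes(feats, labels):
    """Mean-pool visible features per pseudo-identity, then L2-normalize.

    feats:  (N, D) array of visible tracklet features
    labels: (N,) integer pseudo-identity labels
    """
    protos = np.stack([feats[labels == c].mean(axis=0)
                       for c in np.unique(labels)])
    return protos / np.linalg.norm(protos, axis=1, keepdims=True)

def assign_with_uncertainty(ir_feats, protos):
    """Assign infrared features to visible prototypes with a confidence weight.

    Returns (pseudo_labels, weights), where the weight is the cosine-similarity
    margin between the best and second-best prototype -- a stand-in for the
    paper's uncertainty-aware supervision (assumption, not the actual method).
    """
    ir = ir_feats / np.linalg.norm(ir_feats, axis=1, keepdims=True)
    sim = ir @ protos.T                      # (M, C) cosine similarities
    order = np.sort(sim, axis=1)             # ascending per row
    margin = order[:, -1] - order[:, -2]     # top-1 minus top-2 similarity
    return sim.argmax(axis=1), margin
```

Low-margin (high-uncertainty) infrared samples could then be down-weighted or deferred in the coarse-to-fine alignment, which matches the spirit of supervision guided by reliable visible prototypes.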