Purify-then-Align: Towards Robust Human Sensing under Modality Missing with Knowledge Distillation from Noisy Multimodal Teacher
arXiv cs.CV / 4/8/2026
Key Points
- The paper addresses robust multimodal human sensing when some modalities are missing, and traces the difficulty to two linked causes: a representation gap across heterogeneous inputs and contamination from low-quality modalities.
- It proposes Purify-then-Align (PTA), a framework that first purifies the modality signals, using meta-learning to dynamically down-weight noisy, low-contributing modalities.
- PTA then aligns modalities via diffusion-based knowledge distillation, using a clean, information-rich teacher derived from the purified consensus to refine student modality features.
- Experiments on MM-Fi and XRF55, under pronounced representation-gap and contamination conditions, report state-of-the-art results and improved robustness of single-modality encoders across missing-modality scenarios.
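
The purify-then-align idea in the points above can be illustrated with a deliberately simplified sketch. It is not the paper's method: the meta-learned weighting is replaced here by a softmax over hand-supplied quality scores, and the diffusion-based distillation is replaced by a plain mean-squared-error loss against the weighted consensus. The function names, modality names, and scores are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def purified_consensus(features, quality_scores):
    """Weight each modality's feature vector by its (assumed) quality score.

    features:       dict of modality name -> feature vector (equal lengths)
    quality_scores: dict of modality name -> score; a stand-in for what a
                    meta-learner might produce, higher meaning more reliable
    Returns the weighted-average "teacher" vector and the per-modality weights.
    """
    names = list(features)
    weights = softmax([quality_scores[n] for n in names])
    dim = len(features[names[0]])
    teacher = [
        sum(w * features[n][i] for w, n in zip(weights, names))
        for i in range(dim)
    ]
    return teacher, dict(zip(names, weights))


def alignment_loss(student, teacher):
    """Mean squared error between a student feature and the teacher.

    A simple substitute for the diffusion-based distillation objective.
    """
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(teacher)


if __name__ == "__main__":
    # Toy features: "wifi" is treated as the contaminated modality.
    features = {
        "rgb": [1.0, 0.0],
        "mmwave": [0.8, 0.2],
        "wifi": [-3.0, 4.0],
    }
    quality_scores = {"rgb": 2.0, "mmwave": 1.5, "wifi": -2.0}

    teacher, weights = purified_consensus(features, quality_scores)
    print("weights:", weights)
    for name, feat in features.items():
        print(name, "alignment loss:", alignment_loss(feat, teacher))
```

In this toy setup the low-scored modality contributes little to the consensus, so the teacher stays close to the cleaner modalities and each single-modality student is pulled toward that cleaner target.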