Continual Multimodal Egocentric Activity Recognition via Modality-Aware Novel Detection
arXiv cs.CV / 3/19/2026
Key Points
- The authors propose MAND, a modality-aware framework for open-world continual learning on multimodal egocentric data that detects novel activities while learning from non-stationary activity streams.
- It introduces Modality-aware Adaptive Scoring (MoAS), which estimates per-sample modality reliability from energy scores and adaptively fuses the per-modality logits, better exploiting cues from all modalities, especially IMU.
- During training, Modality-wise Representation Stabilization Training (MoRST) preserves modality-specific discriminability across tasks via auxiliary per-modality heads and modality-wise logit distillation.
- The approach counters the tendency of fused logits to be dominated by RGB, which leaves IMU cues underutilized, and mitigates catastrophic forgetting in open-world settings.
- Experiments on a public multimodal egocentric benchmark show up to a 10% improvement in novel-activity detection AUC and up to a 2.8% improvement in known-class accuracy over state-of-the-art baselines.
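The energy-based fusion idea behind MoAS can be sketched as follows. This is a minimal illustration of the general mechanism described above, not the paper's implementation: it assumes the standard energy score E(x) = -T·logsumexp(logits/T) (lower energy meaning a more confident, more in-distribution prediction) and a softmax over negative energies as the per-sample reliability weights; the function names and the temperature choice are our own.

```python
import numpy as np

def energy_score(logits, T=1.0):
    # Standard energy score: E(x) = -T * logsumexp(logits / T).
    # Lower energy ~ more confident / more in-distribution prediction.
    z = logits / T
    m = z.max()
    return -T * (m + np.log(np.sum(np.exp(z - m))))  # stabilized logsumexp

def fuse_modalities(logits_by_modality, T=1.0):
    # Per-sample reliability weights from negative energies (softmax-normalized),
    # then a weighted sum of per-modality logits. This is a sketch of the
    # adaptive-fusion idea, not the paper's exact scoring rule.
    energies = np.array([energy_score(l, T) for l in logits_by_modality])
    neg_e = -energies
    neg_e -= neg_e.max()                      # numerical stability
    weights = np.exp(neg_e) / np.exp(neg_e).sum()
    fused = sum(w * l for w, l in zip(weights, logits_by_modality))
    return fused, weights

# Toy example: a confident RGB head vs. a near-uniform (high-energy) IMU head.
rgb_logits = np.array([4.0, 1.0, 0.5])
imu_logits = np.array([0.2, 0.1, 0.0])
fused, weights = fuse_modalities([rgb_logits, imu_logits])
```

On this toy input the low-energy RGB head receives the larger fusion weight; on a sample where IMU is the more discriminative modality, the weighting would shift toward IMU, which is the behavior the method aims for.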