MMGait: Towards Multi-Modal Gait Recognition

arXiv cs.CV / April 20, 2026


Key Points

  • The paper introduces MMGait, a multi-modal gait recognition benchmark designed to improve performance beyond RGB-only approaches in real-world settings.
  • MMGait integrates data from five heterogeneous sensors (RGB, depth, infrared, LiDAR, and 4D radar) and provides 12 modalities across 334,060 sequences from 725 subjects.
  • The authors evaluate single-modal, cross-modal, and multi-modal gait recognition to study each modality’s robustness and how different modalities complement one another.
  • They propose a new unified task, Omni Multi-Modal Gait Recognition, and present a baseline model (OmniGait) that learns a shared embedding space across modalities.
  • The benchmark, codebase, and pretrained checkpoints are released publicly to support systematic research and experimentation.

Abstract

Gait recognition has emerged as a powerful biometric technique for identifying individuals at a distance without requiring user cooperation. Most existing methods focus primarily on RGB-derived modalities, which fall short in real-world scenarios requiring multi-modal collaboration and cross-modal retrieval. To overcome these challenges, we present MMGait, a comprehensive multi-modal gait benchmark integrating data from five heterogeneous sensors, including an RGB camera, a depth camera, an infrared camera, a LiDAR scanner, and a 4D Radar system. MMGait contains twelve modalities and 334,060 sequences from 725 subjects, enabling systematic exploration across geometric, photometric, and motion domains. Based on MMGait, we conduct extensive evaluations on single-modal, cross-modal, and multi-modal paradigms to analyze modality robustness and complementarity. Furthermore, we introduce a new task, Omni Multi-Modal Gait Recognition, which aims to unify the above three gait recognition paradigms within a single model. We also propose a simple yet powerful baseline, OmniGait, which learns a shared embedding space across diverse modalities and achieves promising recognition performance. The MMGait benchmark, codebase, and pretrained checkpoints are publicly available at https://github.com/BNU-IVC/MMGait.
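The core idea behind OmniGait, as described above, is to map features from heterogeneous sensors into one shared embedding space so that single-modal, cross-modal, and multi-modal recognition can all be served by the same model. The paper does not give implementation details here, so the following is a minimal illustrative sketch, not the authors' method: it assumes hypothetical per-modality feature dimensions and uses simple random linear projection heads to show how cross-modal retrieval (e.g., querying an RGB gallery with a LiDAR sequence) works once everything lives in a common, L2-normalized space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-modality feature dimensions (illustrative, not from the paper).
MODALITY_DIMS = {"rgb": 512, "depth": 256, "lidar": 128}
EMBED_DIM = 64  # assumed size of the shared embedding space

# One projection head per modality, mapping into the shared space.
# In a trained model these would be learned; here they are random stand-ins.
heads = {m: rng.standard_normal((d, EMBED_DIM)) / np.sqrt(d)
         for m, d in MODALITY_DIMS.items()}

def embed(features: np.ndarray, modality: str) -> np.ndarray:
    """Project modality-specific features into the shared, L2-normalized space."""
    z = features @ heads[modality]
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

# Cross-modal retrieval: query with a LiDAR sequence against an RGB gallery.
gallery = embed(rng.standard_normal((100, MODALITY_DIMS["rgb"])), "rgb")
query = embed(rng.standard_normal((1, MODALITY_DIMS["lidar"])), "lidar")
scores = query @ gallery.T          # cosine similarities (unit-norm vectors)
best_match = int(np.argmax(scores))  # index of the closest gallery sequence
```

Because every modality is projected into the same normalized space, retrieval reduces to a dot product regardless of which sensor produced the query or the gallery; in practice the heads would be trained (e.g., with an identity-level metric loss) so that embeddings of the same subject align across modalities.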