Explicit Time-Frequency Dynamics for Skeleton-Based Gait Recognition
arXiv cs.CV / 4/6/2026
Key Points
- The paper proposes a plug-and-play Wavelet Feature Stream that adds explicit time-frequency dynamics of joint velocities to existing skeleton-based gait recognition models.
- It converts per-joint velocity sequences into multi-scale scalograms via the continuous wavelet transform (CWT), then uses a lightweight multi-scale CNN to learn discriminative dynamic cues.
- The learned dynamic descriptor is fused with the original skeleton backbone representation for classification, without changing the backbone architecture or requiring extra supervision.
- Experiments on CASIA-B show consistent performance gains across strong skeleton backbones (GaitMixer, GaitFormer, GaitGraph), and the approach sets a new skeleton-based state of the art when combined with GaitMixer.
- The method delivers especially large improvements under covariate shifts such as carrying bags (BG) and wearing coats (CL), indicating that explicit time-frequency modeling complements spatio-temporal encoders.
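
To make the velocity-to-scalogram step concrete, here is a minimal, dependency-free sketch of that preprocessing for a single joint coordinate. It is an illustration under stated assumptions, not the paper's implementation: the summary does not name the mother wavelet, the scales, or the frame rate, so the complex Morlet wavelet, `omega0 = 6.0`, the scale set `[2, 4, 8, 16]`, the 30 fps rate, and the synthetic 2 Hz trajectory are all hypothetical choices, as are the helper names `joint_velocity`, `morlet`, and `scalogram`. The multi-scale CNN and the fusion with the backbone are not shown.

```python
import cmath
import math


def joint_velocity(positions):
    """First-order frame difference of one joint coordinate over time."""
    return [b - a for a, b in zip(positions, positions[1:])]


def morlet(t, scale, omega0=6.0):
    """Complex Morlet wavelet at offset t, dilated by `scale`.

    Assumed wavelet; the constant normalisation factor is omitted.
    """
    x = t / scale
    return cmath.exp(1j * omega0 * x) * math.exp(-0.5 * x * x) / math.sqrt(scale)


def scalogram(signal, scales, omega0=6.0):
    """Magnitude of a discretised CWT: one row per scale, one column per frame."""
    n = len(signal)
    rows = []
    for s in scales:
        row = []
        for tau in range(n):
            # Correlate the signal with the conjugated wavelet centred at tau.
            coeff = sum(
                signal[t] * morlet(t - tau, s, omega0).conjugate()
                for t in range(n)
            )
            row.append(abs(coeff))
        rows.append(row)
    return rows


# Synthetic stand-in for one joint coordinate: a 2 Hz oscillation sampled
# at an assumed 30 frames per second (61 frames give 60 velocity samples).
fps = 30.0
positions = [math.sin(2.0 * math.pi * 2.0 * t / fps) for t in range(61)]

velocity = joint_velocity(positions)
scales = [2, 4, 8, 16]  # hypothetical scale set
S = scalogram(velocity, scales)

print(len(S), len(S[0]))  # 4 60
```

In a full pipeline, one such scales-by-frames map per joint (and per coordinate) would be stacked into the input of the CNN described above. A library routine such as PyWavelets' `pywt.cwt` would normally replace the quadratic-time loops used here for readability.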




