Motion-Adaptive Multi-Scale Temporal Modelling with Skeleton-Constrained Spatial Graphs for Efficient 3D Human Pose Estimation

arXiv cs.CV / 4/7/2026

Key Points

The paper introduces MASC-Pose, an efficient framework for 3D human pose estimation from monocular videos that addresses the challenges of modelling both spatial and temporal dependencies.
It uses an Adaptive Multi-scale Temporal Modelling (AMTM) module to capture different motion dynamics across temporal scales in a motion-adaptive way.
For spatial reasoning, it proposes a Skeleton-constrained Adaptive GCN (SAGCN) that models joint-specific interactions while leveraging skeletal structure constraints.
Experiments on Human3.6M and MPI-INF-3DHP show that the approach improves accuracy while remaining computationally efficient relative to methods built on dense attention or fixed modelling schemes.
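
To make the idea of motion-adaptive multi-scale temporal modelling more concrete, here is a minimal sketch in plain Python. It is not the paper's AMTM module, whose internals are not described in this summary. It assumes a single joint coordinate tracked over time, uses moving averages as stand-ins for temporal branches at different scales, and blends them with a softmax gate driven by the mean frame-to-frame motion. The function names, the window sizes (1, 3, 9) and the `sharpness` parameter are all hypothetical.

```python
import math


def moving_average(seq, window):
    """Centred moving average; the window is clipped at the sequence edges."""
    half = window // 2
    out = []
    for t in range(len(seq)):
        lo, hi = max(0, t - half), min(len(seq), t + half + 1)
        out.append(sum(seq[lo:hi]) / (hi - lo))
    return out


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scale_weights(seq, windows, sharpness=1.0):
    """Hypothetical gate: larger motion shifts weight towards shorter windows."""
    steps = max(1, len(seq) - 1)
    motion = sum(abs(b - a) for a, b in zip(seq, seq[1:])) / steps
    # Score each scale by -motion * window, so fast motion penalises
    # long windows more strongly than short ones.
    return softmax([-sharpness * motion * w for w in windows])


def adaptive_multiscale(seq, windows=(1, 3, 9), sharpness=1.0):
    """Blend the multi-scale branches using the motion-dependent weights."""
    weights = scale_weights(seq, windows, sharpness)
    branches = [moving_average(seq, w) for w in windows]
    return [
        sum(w * branch[t] for w, branch in zip(weights, branches))
        for t in range(len(seq))
    ]
```

With a static sequence the gate weights all scales equally, while a rapidly oscillating sequence pushes almost all weight onto the shortest window, so fast motion is not smoothed away. A learned model would replace both the moving averages and the hand-written gate with trainable layers.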

Abstract

Accurate 3D human pose estimation from monocular videos requires effective modelling of complex spatial and temporal dependencies. However, existing methods often face challenges in efficiency and adaptability when modelling spatial and temporal dependencies, particularly under dense attention or fixed modelling schemes. In this work, we propose MASC-Pose, a Motion-Adaptive multi-scale temporal modelling framework with Skeleton-Constrained spatial graphs for efficient 3D human pose estimation. Specifically, it introduces an Adaptive Multi-scale Temporal Modelling (AMTM) module to adaptively capture heterogeneous motion dynamics at different temporal scales, together with a Skeleton-constrained Adaptive GCN (SAGCN) for joint-specific spatial interaction modelling. By jointly enabling adaptive temporal reasoning and efficient spatial aggregation, our method achieves strong accuracy with high computational efficiency. Extensive experiments on Human3.6M and MPI-INF-3DHP datasets demonstrate the effectiveness of our approach.
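
As an illustration of what a skeleton-constrained spatial graph can look like, the sketch below masks a matrix of per-edge scores with the skeleton's bone connectivity, so that each joint aggregates only from itself and its bone-connected neighbours. This is an assumption-laden toy, not the paper's SAGCN: the 5-joint chain skeleton, the function names, and the use of fixed positive scores in place of learned, joint-specific weights are all hypothetical, and the usual linear feature transform of a GCN layer is omitted.

```python
def skeleton_mask(num_joints, bones):
    """Binary mask with self-loops and symmetric bone connections."""
    mask = [
        [1.0 if i == j else 0.0 for j in range(num_joints)]
        for i in range(num_joints)
    ]
    for a, b in bones:
        mask[a][b] = 1.0
        mask[b][a] = 1.0
    return mask


def constrained_adjacency(mask, edge_scores):
    """Zero out scores for non-skeleton pairs, then row-normalise.

    Assumes every score is positive, so each row sum is non-zero
    thanks to the self-loop.
    """
    adjacency = []
    for mask_row, score_row in zip(mask, edge_scores):
        row = [m * s for m, s in zip(mask_row, score_row)]
        total = sum(row)
        adjacency.append([v / total for v in row])
    return adjacency


def graph_aggregate(adjacency, features):
    """Weighted sum of neighbour features for every joint."""
    num_joints, dim = len(features), len(features[0])
    return [
        [
            sum(adjacency[i][j] * features[j][d] for j in range(num_joints))
            for d in range(dim)
        ]
        for i in range(num_joints)
    ]


# Hypothetical 5-joint chain: 0 - 1 - 2 - 3 - 4
bones = [(0, 1), (1, 2), (2, 3), (3, 4)]
mask = skeleton_mask(5, bones)
scores = [[1.0] * 5 for _ in range(5)]  # stand-in for learned edge scores
adjacency = constrained_adjacency(mask, scores)
features = [[float(i)] for i in range(5)]  # one scalar feature per joint
aggregated = graph_aggregate(adjacency, features)
```

Because the mask removes every non-skeleton pair, the aggregation touches only a handful of entries per joint rather than all joint pairs, which is one plausible reading of how a skeletal constraint can keep spatial modelling cheaper than dense attention.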
