Unsupervised Skeleton-Based Action Segmentation via Hierarchical Spatiotemporal Vector Quantization
arXiv cs.CV / 4/17/2026
Key Points
- The paper introduces a hierarchical spatiotemporal vector quantization framework for unsupervised skeleton-based temporal action segmentation.
- It uses two consecutive quantization levels: a lower level captures fine-grained subactions and a higher level aggregates them into action-level representations.
- A baseline variant already achieves strong results by exploiting primarily spatial cues, reconstructing the input skeletons; the full method improves on it by incorporating temporal information as well.
- The extended hierarchical spatiotemporal version performs multi-level clustering while also reconstructing the skeleton inputs and their corresponding timestamps.
- Experiments on HuGaDB, LARa, and BABEL report new state-of-the-art performance and reduced segment-length bias in unsupervised action segmentation.
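The two-level scheme in the key points can be illustrated with a minimal sketch: frame features are first snapped to a lower-level (subaction) codebook, consecutive subaction codes are aggregated, and the aggregates are snapped to a higher-level (action) codebook. This is a toy illustration only, not the paper's implementation; the codebook sizes, the feature dimension, and the fixed-window mean aggregation are all assumptions made for the example.

```python
import numpy as np

def quantize(x, codebook):
    """Assign each row of x to its nearest codebook entry (L2 distance)."""
    # pairwise squared distances, shape (n, k)
    d = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d.argmin(axis=1)
    return codebook[idx], idx

rng = np.random.default_rng(0)
frames = rng.normal(size=(12, 8))        # per-frame skeleton features (assumed dim 8)
sub_codebook = rng.normal(size=(16, 8))  # lower level: fine-grained subaction codes
act_codebook = rng.normal(size=(4, 8))   # higher level: action-level codes

# Level 1: quantize each frame to a subaction code
sub_q, sub_idx = quantize(frames, sub_codebook)

# Aggregate consecutive subaction codes (here: mean over windows of 3 frames)
windows = sub_q.reshape(4, 3, 8).mean(axis=1)

# Level 2: quantize the aggregated representations to action codes
act_q, act_idx = quantize(windows, act_codebook)
```

In a trained model the codebooks would be learned jointly with reconstruction losses on the skeletons (and, per the hierarchical spatiotemporal variant, their timestamps); here they are random and serve only to show the data flow.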