Generalized Hand-Object Pose Estimation with Occlusion Awareness
arXiv cs.CV / 3/20/2026
Key Points
- GenHOI is a generalized hand-object pose estimation framework built to handle heavy occlusion. It combines hierarchical semantic prompts with hand priors to improve generalization to unseen objects and interactions.
- The approach encodes object states, hand configurations, and interaction patterns through textual descriptions to learn abstract, high-level representations of hand-object interactions.
- It employs a multi-modal masked modeling strategy over RGB images, predicted point clouds, and textual descriptions to enable robust occlusion reasoning, with hand priors serving as stable spatial references.
- On the DexYCB and HO3Dv2 benchmarks, the authors report state-of-the-art hand-object pose estimation and strong generalization under heavy occlusion.
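
To make the masked-modeling idea in the key points more concrete, here is a minimal sketch of masking tokens across several modalities while leaving hand-prior tokens untouched. This is not GenHOI's implementation: the function `mask_modalities`, the modality names, the token strings, and the masking ratios are all hypothetical, and the summary does not say how the paper tokenizes or masks its inputs.

```python
import random

MASK = "<mask>"  # placeholder token; an assumption for illustration


def mask_modalities(tokens, ratios, keep=("hand_prior",), seed=0):
    """Randomly replace a fraction of each modality's tokens with MASK.

    tokens: dict mapping modality name -> list of tokens
    ratios: dict mapping modality name -> fraction of tokens to mask (0..1)
    keep:   modalities that are never masked (here, the hand prior,
            assumed to act as a stable reference)
    Returns (masked_tokens, masked_indices), both keyed by modality.
    """
    rng = random.Random(seed)
    masked, indices = {}, {}
    for name, seq in tokens.items():
        if name in keep:
            masked[name] = list(seq)
            indices[name] = []
            continue
        n_mask = int(round(ratios.get(name, 0.0) * len(seq)))
        idx = sorted(rng.sample(range(len(seq)), n_mask))
        idx_set = set(idx)
        masked[name] = [MASK if i in idx_set else t for i, t in enumerate(seq)]
        indices[name] = idx
    return masked, indices


# Toy inputs standing in for image patches, point-cloud groups, and a caption.
example = {
    "rgb": [f"patch{i}" for i in range(8)],
    "points": [f"pt{i}" for i in range(8)],
    "text": "hand grasps mug by the handle".split(),
    "hand_prior": [f"joint{i}" for i in range(4)],
}
masked, idx = mask_modalities(example, {"rgb": 0.5, "points": 0.5, "text": 0.5})
```

In a training setup along these lines, a model would be asked to reconstruct the tokens at `idx` from the remaining visible tokens in all modalities, which is one way to encourage reasoning about content hidden by occlusion.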