Dynamic LIBRAS Gesture Recognition via CNN over Spatiotemporal Matrix Representation
arXiv cs.AI / 3/30/2026
💬 Opinion · Ideas & Deep Analysis · Models & Research
Key Points
- The paper introduces a dynamic LIBRAS (Brazilian Sign Language) gesture recognition approach that combines the MediaPipe Hand Landmarker, which extracts 21 hand skeletal keypoints per frame, with a CNN for classification (see the extraction sketch after this list).
- Each gesture is encoded as a 90×21 spatiotemporal matrix built from the keypoint trajectories, letting the CNN distinguish 11 static and dynamic gesture classes (see the model sketch below).
- For real-time continuous recognition, the method slides a window over the incoming frame stream and applies temporal frame triplication, capturing temporal context without resorting to recurrent networks (see the streaming sketch below).
- Experiments report 95% accuracy under low-light conditions and 92% under normal lighting, supporting the approach's feasibility for controlling home automation devices.
- The authors note that further systematic testing with a broader range of users is needed to better assess generalization across diverse populations.
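The sketches below illustrate how such a pipeline could look in Python. First, per-frame keypoint extraction with the MediaPipe Tasks Hand Landmarker. The model asset path and the (3, 21) output layout are assumptions for illustration, not details taken from the paper.

```python
import numpy as np
import mediapipe as mp
from mediapipe.tasks import python as mp_python
from mediapipe.tasks.python import vision

# Hand Landmarker from the MediaPipe Tasks API; the .task model file
# must be downloaded separately (the path here is an assumption).
options = vision.HandLandmarkerOptions(
    base_options=mp_python.BaseOptions(model_asset_path="hand_landmarker.task"),
    running_mode=vision.RunningMode.VIDEO,
    num_hands=1,
)
landmarker = vision.HandLandmarker.create_from_options(options)

def extract_keypoints(rgb_frame: np.ndarray, timestamp_ms: int) -> np.ndarray | None:
    """Return a (3, 21) array of normalized (x, y, z) coordinates for the
    21 hand keypoints, or None when no hand is detected in the frame."""
    image = mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb_frame)
    result = landmarker.detect_for_video(image, timestamp_ms)
    if not result.hand_landmarks:
        return None
    hand = result.hand_landmarks[0]  # first (and only) detected hand
    return np.array([[lm.x for lm in hand],
                     [lm.y for lm in hand],
                     [lm.z for lm in hand]])
```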
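Next, one way the 90×21 matrix could be assembled and classified. The summary only gives the matrix size, so the layout assumed here (30 frames × 3 coordinate rows per frame = 90 rows, one column per keypoint) and the layer sizes are illustrative guesses, not the paper's exact design.

```python
import numpy as np
import torch
import torch.nn as nn

def build_matrix(frames: list[np.ndarray]) -> np.ndarray:
    """Stack 30 per-frame (3, 21) keypoint arrays into a 90x21 matrix:
    three coordinate rows (x, y, z) per frame, one column per keypoint.
    This layout is an assumption; the paper only states the 90x21 size."""
    assert len(frames) == 30
    return np.concatenate(frames, axis=0)  # (90, 21)

class GestureCNN(nn.Module):
    """Small 2-D CNN over the 90x21 input; layer sizes are illustrative,
    not the paper's architecture."""
    def __init__(self, num_classes: int = 11):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                 # 90x21 -> 45x10
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                 # 45x10 -> 22x5
        )
        self.classifier = nn.Linear(32 * 22 * 5, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, 90, 21) -> class logits (batch, num_classes)
        return self.classifier(self.features(x).flatten(1))
```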
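Finally, a minimal sliding-window loop for continuous recognition, reusing `extract_keypoints` and `GestureCNN` from the sketches above. How "frame triplication" fills the 90 rows is not spelled out in the summary; this sketch assumes a 30-frame window with x/y/z row stacking, and shows a commented alternative in which each frame's row is literally repeated three times.

```python
from collections import deque
import numpy as np
import torch

WINDOW = 30  # assumed window length (30 frames x 3 coordinate rows = 90)
window: deque = deque(maxlen=WINDOW)

def on_frame(rgb_frame: np.ndarray, timestamp_ms: int,
             model: GestureCNN) -> int | None:
    """Push one video frame; return a predicted class id once the
    sliding window is full, else None."""
    kps = extract_keypoints(rgb_frame, timestamp_ms)
    if kps is None:
        return None
    window.append(kps)  # the oldest frame drops out automatically
    if len(window) < WINDOW:
        return None
    matrix = np.concatenate(list(window), axis=0)  # (90, 21)
    # Alternative reading of "triplication": repeat each frame's row
    # (here, only the x-coordinates, hypothetically) three times:
    # matrix = np.repeat(np.stack([f[0] for f in window]), 3, axis=0)
    x = torch.from_numpy(matrix).float()[None, None]  # (1, 1, 90, 21)
    with torch.no_grad():
        return int(model(x).argmax(dim=1))
```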