Dynamic LIBRAS Gesture Recognition via CNN over Spatiotemporal Matrix Representation

arXiv cs.AI / March 30, 2026


Key Points

  • The paper introduces a dynamic LIBRAS (Brazilian Sign Language) gesture recognition approach that combines MediaPipe Hand Landmarker for extracting 21 hand skeletal keypoints with a CNN for classification.
  • Gestures are encoded as a spatiotemporal matrix of size 90×21 derived from keypoints, allowing the CNN to recognize 11 static and dynamic gesture classes.
  • For real-time continuous recognition, the method uses a sliding window and temporal frame triplication to avoid recurrent networks while still capturing temporal context.
  • Experiments report 95% accuracy in low-light conditions and 92% accuracy in normal lighting, supporting the feasibility of the approach for home automation device control.
  • The authors note that further systematic testing with a wider range of users is needed to better assess generalization performance across diverse populations.
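The 90×21 encoding in the second bullet can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper states only the matrix shape (90 time steps × 21 keypoints), so the choice of one scalar feature per keypoint per frame, zero-padding for short sequences, and truncation for long ones are all assumptions here.

```python
import numpy as np

NUM_FRAMES = 90     # temporal dimension of the CNN input (from the paper)
NUM_KEYPOINTS = 21  # MediaPipe Hand Landmarker returns 21 hand landmarks

def build_spatiotemporal_matrix(frames):
    """Stack per-frame keypoint features into a (90, 21) matrix.

    `frames` is a sequence of length-21 feature vectors, one scalar per
    keypoint per frame (e.g. a normalized coordinate -- the exact feature
    is an assumption; the paper only specifies the 90x21 shape).
    Sequences shorter than 90 frames are zero-padded; longer ones are
    truncated.
    """
    matrix = np.zeros((NUM_FRAMES, NUM_KEYPOINTS), dtype=np.float32)
    for i, feats in enumerate(frames[:NUM_FRAMES]):
        matrix[i] = feats
    return matrix

# Example: a 40-frame gesture is padded out to the fixed 90-frame input.
sequence = [np.random.rand(NUM_KEYPOINTS) for _ in range(40)]
m = build_spatiotemporal_matrix(sequence)
print(m.shape)  # (90, 21)
```

Treating the keypoint sequence as a fixed-size 2-D "image" is what lets a plain CNN, rather than a recurrent network, learn both spatial (keypoint) and temporal (frame) patterns.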

Abstract

This paper proposes a method for dynamic hand gesture recognition based on the composition of two models: the MediaPipe Hand Landmarker, responsible for extracting 21 skeletal keypoints of the hand, and a convolutional neural network (CNN) trained to classify gestures from a 90×21 spatiotemporal matrix representation of those keypoints. The method is applied to the recognition of LIBRAS (Brazilian Sign Language) gestures for device control in a home automation system, covering 11 classes of static and dynamic gestures. For real-time inference, a sliding window with temporal frame triplication is used, enabling continuous recognition without recurrent networks. Tests achieved 95% accuracy under low-light conditions and 92% under normal lighting. The results indicate that the approach is effective, although systematic experiments with greater user diversity are needed for a more thorough evaluation of generalization.
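The sliding-window inference with frame triplication could look like the sketch below. The abstract does not spell the mechanism out, so this is one plausible reading under stated assumptions: a 30-frame capture window whose frames are each repeated three times along the time axis to fill the 90-row CNN input; the window length, stride, and repeat factor are all hypothetical.

```python
import numpy as np

WINDOW = 30  # assumed capture-window length (90 / 3)
REPEAT = 3   # "temporal frame triplication": each frame repeated 3x

def triplicate_window(window):
    """Expand a (30, 21) keypoint window to the (90, 21) CNN input
    by repeating each frame three times along the time axis."""
    window = np.asarray(window, dtype=np.float32)
    assert window.shape[0] == WINDOW
    return np.repeat(window, REPEAT, axis=0)

def sliding_windows(stream, stride=1):
    """Yield triplicated (90, 21) inputs over a stream of per-frame
    keypoint rows, advancing by `stride` frames per inference step."""
    for start in range(0, len(stream) - WINDOW + 1, stride):
        yield triplicate_window(stream[start:start + WINDOW])

# Example: 45 frames of landmarks, one CNN input every 5 frames.
stream = [np.random.rand(21) for _ in range(45)]
inputs = list(sliding_windows(stream, stride=5))
print(inputs[0].shape)  # (90, 21)
```

Repeating frames this way lets a single CNN trained on 90-frame matrices run continuously on shorter live windows, which is how the method avoids recurrent networks while still covering temporal context.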