A Multi-Modal CNN-LSTM Framework with Multi-Head Attention and Focal Loss for Real-Time Elderly Fall Detection

arXiv cs.AI / 3/25/2026


Key Points

The paper presents MultiModalFallDetector, a multi-modal wearable-sensor deep learning framework for real-time elderly fall detection using tri-axial accelerometer, gyroscope, and multi-channel physiological signals.
It combines a multi-scale CNN feature extractor, multi-head self-attention for dynamic temporal weighting, and an auxiliary activity classification task to regularize training.
To address class imbalance common in fall datasets, the method uses Focal Loss and applies transfer learning from UCI HAR to the SisFall dataset.
Experiments on SisFall report an F1-score of 98.7, Recall of 98.9, and AUC-ROC of 99.4, with inference latency under 50 ms on edge devices, which the authors present as suitable for real-time deployment in geriatric care.
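To make the role of Focal Loss in the key points concrete, here is a minimal NumPy sketch of the standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The function name `binary_focal_loss` and the defaults `gamma=2.0` and `alpha=0.25` are assumptions for illustration (they are commonly used values); the summary does not state the paper's settings or its implementation.

```python
import numpy as np

def binary_focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss (illustrative sketch, not the paper's code).

    probs  : predicted probability of the positive class (fall), in [0, 1]
    labels : ground-truth labels, 1 = fall, 0 = non-fall
    gamma  : focusing parameter; gamma = 0 removes the down-weighting
    alpha  : weight for the positive class (1 - alpha for the negative class)
    """
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    labels = np.asarray(labels, dtype=float)
    # p_t is the probability the model assigned to the true class.
    p_t = np.where(labels == 1, probs, 1.0 - probs)
    alpha_t = np.where(labels == 1, alpha, 1.0 - alpha)
    # (1 - p_t)^gamma shrinks the loss of well-classified (easy) samples.
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))

# A confidently correct fall prediction contributes far less than a missed one.
print(binary_focal_loss([0.9], [1]))
print(binary_focal_loss([0.1], [1]))
```

Because easy samples are down-weighted, the many ordinary-activity windows in an imbalanced dataset dominate the gradient less than they would under plain cross-entropy.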

Abstract

The increasing global aging population has intensified the demand for reliable health monitoring systems, particularly those capable of detecting critical events such as falls among elderly individuals. Traditional fall detection approaches relying on single-modality acceleration data suffer from high false alarm rates, while conventional machine learning methods require extensive hand-crafted feature engineering. This paper proposes a novel multi-modal deep learning framework, MultiModalFallDetector, designed for real-time elderly fall detection using wearable sensors. Our approach integrates multiple innovations: a multi-scale CNN-based feature extractor capturing motion dynamics at varying temporal resolutions; fusion of tri-axial accelerometer, gyroscope, and four-channel physiological signals; incorporation of a multi-head self-attention mechanism for dynamic temporal weighting; adoption of Focal Loss to mitigate severe class imbalance; introduction of an auxiliary activity classification task for regularization; and implementation of transfer learning from UCI HAR to the SisFall dataset. Extensive experiments on the SisFall dataset, which includes real-world simulated fall trials from elderly participants (aged 60-85), demonstrate that our framework achieves an F1-score of 98.7, Recall of 98.9, and AUC-ROC of 99.4, significantly outperforming baseline methods including traditional machine learning and standard deep learning approaches. The model maintains sub-50 ms inference latency on edge devices, confirming its suitability for real-time deployment in geriatric care settings.
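The "dynamic temporal weighting" attributed to multi-head self-attention can be illustrated with a small NumPy sketch of scaled dot-product attention over a sequence of per-time-step features. Everything here is an assumption for illustration: the function names, the random projection matrices, and the sizes (8 time steps, 16 features, 4 heads) are hypothetical and are not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Scaled dot-product self-attention over a (T, d_model) feature sequence.

    Returns the attended features (T, d_model) and the attention weights
    (num_heads, T, T); each weight row sums to 1 over the time axis.
    """
    T, d_model = x.shape
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads

    def split(m):
        # (T, d_model) -> (num_heads, T, d_head)
        return m.reshape(T, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    # Similarity of every time step with every other time step, per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)
    context = weights @ v                                  # (num_heads, T, d_head)
    merged = context.transpose(1, 0, 2).reshape(T, d_model)
    return merged @ w_o, weights

# Hypothetical sizes: 8 time steps of 16-dimensional features, 4 heads.
rng = np.random.default_rng(0)
T, d_model, heads = 8, 16, 4
x = rng.normal(size=(T, d_model))
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(4))
out, attn = multi_head_self_attention(x, w_q, w_k, w_v, w_o, heads)
print(out.shape, attn.shape)
```

In a fall detector, the rows of `attn` would indicate which time steps each head emphasizes, for example the short impact segment within a longer window; in a trained model the projection matrices are learned rather than random.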
