AI Navigate

Evaluating Large Language Models for Gait Classification Using Text-Encoded Kinematic Waveforms

arXiv cs.LG / 3/17/2026


Key Points

  • The study evaluated whether general-purpose LLMs can classify continuous gait kinematics when encoded as textual numeric sequences and compared their performance to traditional classifiers (KNN and OCSVM) using Leave-One-Subject-Out cross-validation.
  • The supervised KNN achieved the highest multiclass MCC of 0.88, outperforming the zero-shot LLMs.
  • GPT-5 with reference grounding reached a multiclass MCC of 0.70 and a binary MCC of 0.68, still below the KNN and above the class-independent OCSVM.
  • Using high-confidence predictions increased the LLM multiclass MCC to 0.83 on the filtered subset, indicating sensitivity to confidence thresholds.
  • The smaller o4-mini model performed comparably to larger models, highlighting computational efficiency; the authors suggest LLMs are better suited to exploratory analysis than to direct diagnostic use.
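The confidence-filtering effect noted above (multiclass MCC rising when low-confidence predictions are discarded) can be sketched in miniature. The helper names and toy predictions below are illustrative stand-ins, not the study's code or data, and the MCC shown is the simple binary form:

```python
import math

def binary_mcc(y_true, y_pred):
    # Matthews Correlation Coefficient for binary labels (0/1).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def filter_by_confidence(preds, threshold):
    # Keep only predictions whose self-rated confidence meets the threshold.
    return [(t, p) for t, p, conf in preds if conf >= threshold]

# Toy data: (true_label, predicted_label, model's self-rated confidence).
preds = [(1, 1, 0.9), (0, 0, 0.8), (1, 0, 0.3),
         (0, 1, 0.4), (1, 1, 0.95), (0, 0, 0.85)]

all_pairs = [(t, p) for t, p, _ in preds]
kept = filter_by_confidence(preds, 0.5)

mcc_all = binary_mcc([t for t, _ in all_pairs], [p for _, p in all_pairs])
mcc_high = binary_mcc([t for t, _ in kept], [p for _, p in kept])
```

On this toy set, both errors carry low confidence, so the filtered MCC exceeds the unfiltered one, mirroring the pattern the study reports.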

Abstract

Background: Machine learning (ML) enhances gait analysis but often lacks the level of interpretability desired for clinical adoption. Large Language Models (LLMs) may offer explanatory capabilities and confidence-aware outputs when applied to structured kinematic data. This study therefore evaluated whether general-purpose LLMs can classify continuous gait kinematics when represented as textual numeric sequences and how their performance compares to conventional ML approaches.

Methods: Lower-body kinematics were recorded from 20 participants performing seven gait patterns. A supervised KNN classifier and a class-independent One-Class SVM (OCSVM) were compared against zero-shot LLMs (GPT-5, GPT-5-mini, GPT-4.1, and o4-mini). Models were evaluated using Leave-One-Subject-Out (LOSO) cross-validation. LLMs were tested both with and without explicit reference gait statistics.

Results: The supervised KNN achieved the highest performance (multiclass Matthews Correlation Coefficient, MCC = 0.88). The best-performing LLM (GPT-5) with reference grounding achieved a multiclass MCC of 0.70 and a binary MCC of 0.68, outperforming the class-independent OCSVM (binary MCC = 0.60). LLM performance was highly dependent on explicit reference information and self-rated confidence; when restricted to high-confidence predictions, multiclass MCC increased to 0.83 on the filtered subset. Notably, the computationally efficient o4-mini model performed comparably to larger models.

Conclusion: When continuous kinematic waveforms were encoded as textual numeric tokens, general-purpose LLMs, even with reference grounding, did not match supervised multiclass classifiers for precise gait classification and are better regarded as exploratory systems requiring cautious, human-guided interpretation rather than diagnostic use.
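The study's core input representation is a continuous kinematic waveform serialized as a textual numeric sequence. A minimal sketch of one plausible serialization follows; the exact format used in the paper is not reproduced here, and `encode_waveform` plus the sample values are hypothetical:

```python
def encode_waveform(name, samples, decimals=2):
    # Render a kinematic waveform as a plain-text numeric sequence,
    # one plausible way to present a continuous signal to a text-only LLM.
    # (Illustrative only; the paper's actual serialization may differ.)
    values = ", ".join(f"{v:.{decimals}f}" for v in samples)
    return f"{name}: [{values}]"

# Hypothetical knee-flexion samples over part of a gait cycle (degrees).
knee = [5.0, 12.3, 20.1, 15.7, 8.2]
prompt_line = encode_waveform("knee_flexion_deg", knee)
```

Such a line could then be embedded in a zero-shot prompt, optionally alongside per-class reference statistics, which is the "reference grounding" condition the abstract describes.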