Agentic AI for Personalized Physiotherapy: A Multi-Agent Framework for Generative Video Training and Real-Time Pose Correction

arXiv cs.AI / 4/25/2026

Key Points

  • The paper addresses low at-home physiotherapy adherence by proposing a personalized, dynamically supervised tele-rehabilitation loop instead of relying on static exercise videos or generic avatars.
  • It introduces a multi-agent system (MAS) with four specialized micro-agents: a Clinical Extraction Agent that turns unstructured medical notes into kinematic constraints, a Video Synthesis Agent that generates patient-specific exercise videos, a Vision Processing Agent for real-time pose estimation, and a Diagnostic Feedback Agent that issues corrective instructions.
  • The framework combines generative AI for exercise video creation with computer-vision-based pose estimation to tailor training to an individual’s injury limitations and home context.
  • The authors describe the system architecture and prototype pipeline using Large Language Models and MediaPipe, and they outline a clinical evaluation plan to assess feasibility and safety.
  • Overall, the work argues that pairing agentic, autonomous decision-making with generative media could help scale personalized physiotherapy safely and effectively.
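
The four-agent loop summarized above could be wired together roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: `KinematicConstraint`, `TeleRehabLoop`, the agent signatures, and the `video://placeholder` reference are all hypothetical names introduced for illustration, and the LLM, video model, and pose estimator are replaced by trivial stubs so the control flow can be run on its own.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class KinematicConstraint:
    """Hypothetical constraint record: allowed angle range (degrees) for one joint."""
    joint: str
    min_deg: float
    max_deg: float


# Assumed signatures for the four agent roles named in the summary.
ExtractFn = Callable[[str], List[KinematicConstraint]]
SynthesizeFn = Callable[[List[KinematicConstraint]], str]
PoseFn = Callable[[Any], Dict[str, float]]
DiagnoseFn = Callable[[Dict[str, float], List[KinematicConstraint]], List[str]]


@dataclass
class TeleRehabLoop:
    """Orchestrates one session: notes -> constraints -> video -> per-frame feedback."""
    extract: ExtractFn
    synthesize: SynthesizeFn
    estimate_pose: PoseFn
    diagnose: DiagnoseFn

    def run_session(self, note: str, frames: Iterable[Any]) -> Tuple[str, List[List[str]]]:
        constraints = self.extract(note)           # Clinical Extraction Agent
        video_ref = self.synthesize(constraints)   # Video Synthesis Agent
        feedback_log = []
        for frame in frames:
            angles = self.estimate_pose(frame)     # Vision Processing Agent
            feedback_log.append(self.diagnose(angles, constraints))  # Diagnostic Feedback Agent
        return video_ref, feedback_log


# Stub agents (stand-ins only; a real system would call an LLM, a video model, etc.).
def stub_extract(note: str) -> List[KinematicConstraint]:
    return [KinematicConstraint("knee", 0.0, 90.0)]


def stub_synthesize(constraints: List[KinematicConstraint]) -> str:
    return "video://placeholder"


def stub_pose(frame: Any) -> Dict[str, float]:
    return frame  # in this demo each "frame" is already a dict of joint angles


def stub_diagnose(angles: Dict[str, float], constraints: List[KinematicConstraint]) -> List[str]:
    return [
        f"{c.joint}: keep within {c.min_deg:.0f}-{c.max_deg:.0f} deg"
        for c in constraints
        if c.joint in angles and not (c.min_deg <= angles[c.joint] <= c.max_deg)
    ]


loop = TeleRehabLoop(stub_extract, stub_synthesize, stub_pose, stub_diagnose)
video_ref, feedback_log = loop.run_session(
    "Limit knee flexion to 90 degrees.",
    [{"knee": 45.0}, {"knee": 110.0}],
)
```

Passing the agents in as callables keeps the orchestration independent of any particular model or vision backend, which is one plausible way to let each micro-agent be swapped or tested in isolation.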

Abstract

At-home physiotherapy compliance remains critically low due to a lack of personalized supervision and dynamic feedback. Existing digital health solutions rely on static, pre-recorded video libraries or generic 3D avatars that fail to account for a patient's specific injury limitations or home environment. In this paper, we propose a novel Multi-Agent System (MAS) architecture that leverages Generative AI and computer vision to close the tele-rehabilitation loop. Our framework consists of four specialized micro-agents: a Clinical Extraction Agent that parses unstructured medical notes into kinematic constraints; a Video Synthesis Agent that utilizes foundational video generation models to create personalized, patient-specific exercise videos; a Vision Processing Agent for real-time pose estimation; and a Diagnostic Feedback Agent that issues corrective instructions. We present the system architecture, detail the prototype pipeline using Large Language Models and MediaPipe, and outline our clinical evaluation plan. This work demonstrates the feasibility of combining generative media with agentic autonomous decision-making to scale personalized patient care safely and effectively.
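
To make the link between pose estimation and corrective instructions concrete, the sketch below computes a joint angle from three 2D landmarks and compares it with an allowed range. It is an illustrative assumption rather than the paper's method: the function names `joint_angle_deg` and `corrective_feedback`, the message wording, and the example range are hypothetical, and the landmark points are hard-coded where a real pipeline would take them from a pose estimator such as MediaPipe.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def joint_angle_deg(a: Point, b: Point, c: Point) -> float:
    """Angle at vertex b, in degrees, between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0.0:
        raise ValueError("coincident landmarks: angle is undefined")
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


def corrective_feedback(joint: str, angle: float, min_deg: float, max_deg: float) -> List[str]:
    """Return a corrective message if the angle leaves the allowed range, else an empty list."""
    if angle < min_deg:
        return [f"Increase the {joint} angle to at least {min_deg:.0f} degrees."]
    if angle > max_deg:
        return [f"Reduce the {joint} angle to at most {max_deg:.0f} degrees."]
    return []


# Hard-coded example landmarks (hip, knee, ankle) forming a right angle at the knee.
hip, knee, ankle = (0.0, 1.0), (0.0, 0.0), (1.0, 0.0)
knee_angle = joint_angle_deg(hip, knee, ankle)
messages = corrective_feedback("knee", knee_angle, 100.0, 180.0)
```

If the landmarks are normalized to image width and height, they would need to be scaled to pixel coordinates first, since unequal scaling of the two axes distorts angles.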