Wearable Foundation Models Should Go Beyond Static Encoders

arXiv cs.LG · March 23, 2026


Key Points

  • Wearable foundation models (WFMs), trained on data from affordable, always-on wearables, currently perform well on short-term health monitoring tasks, but most map short temporal windows to predefined labels with static encoders, emphasizing retrospective prediction.
  • The authors argue that WFMs must move beyond static encoders to support longitudinal, anticipatory health reasoning that can model chronic, progressive, or episodic conditions unfolding over weeks, months, or years.
  • They propose three foundational shifts: structurally rich data and interoperable data ecosystems; longitudinal-aware multimodal modeling with long-context inference and personalization; and agentic inference systems that enable planning, decision-making, and clinically grounded interventions under uncertainty.
  • Implementing these shifts would reframe wearable health monitoring from retrospective signal interpretation toward continuous, anticipatory, and human-aligned health support.

Abstract

Wearable foundation models (WFMs), trained on large volumes of data collected by affordable, always-on devices, have demonstrated strong performance on short-term, well-defined health monitoring tasks, including activity recognition, fitness tracking, and cardiovascular signal assessment. However, most existing WFMs primarily map short temporal windows to predefined labels via static encoders, emphasizing retrospective prediction rather than reasoning over evolving personal history, context, and future risk trajectories. As a result, they are poorly suited for modeling chronic, progressive, or episodic health conditions that unfold over weeks, months, or years. Hence, we argue that WFMs must move beyond static encoders and be explicitly designed for longitudinal, anticipatory health reasoning. We identify three foundational shifts required to enable this transition: (1) Structurally rich data, which goes beyond isolated datasets or outcome-conditioned collection to integrated multimodal data, long-term personal trajectories, and contextual metadata, ideally supported by open and interoperable data ecosystems; (2) Longitudinal-aware multimodal modeling, which prioritizes long-context inference, temporal abstraction, and personalization over cross-sectional or population-level prediction; and (3) Agentic inference systems, which move beyond static prediction to support planning, decision-making, and clinically grounded intervention under uncertainty. Together, these shifts reframe wearable health monitoring from retrospective signal interpretation toward continuous, anticipatory, and human-aligned health support.
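The gap the authors describe between static window encoding and longitudinal reasoning can be illustrated with a toy sketch. This example is not from the paper: the heart-rate values, the `LongitudinalTracker` class, and the 80 bpm and 5 bpm thresholds are all hypothetical, chosen only to show how a slow drift can look "normal" in every isolated window while standing out against a learned personal baseline.

```python
from statistics import mean

def static_window_score(window):
    """Static-encoder style: label one short window of resting heart
    rate samples in isolation, ignoring all prior history."""
    return "elevated" if mean(window) > 80 else "normal"

class LongitudinalTracker:
    """Longitudinal style: learn a personal baseline during a warm-up
    period, then flag sustained drift from that baseline."""
    def __init__(self, baseline_days=14, drift_bpm=5.0):
        self.warmup = []
        self.baseline_days = baseline_days
        self.drift_bpm = drift_bpm
        self.baseline = None

    def update(self, daily_mean):
        if self.baseline is None:
            self.warmup.append(daily_mean)
            if len(self.warmup) == self.baseline_days:
                self.baseline = mean(self.warmup)
            return "learning baseline"
        if daily_mean - self.baseline > self.drift_bpm:
            return "drifting"
        return "stable"

# A slow upward creep of +0.5 bpm/day: each daily window still looks
# "normal" on its own, but the longitudinal view flags the trend.
tracker = LongitudinalTracker()
labels = []
for day in range(30):
    daily_hr = 62 + 0.5 * day
    labels.append((static_window_score([daily_hr]), tracker.update(daily_hr)))

print(labels[-1])
```

The static scorer never fires because no single day exceeds its population-level cutoff, whereas the tracker flags the drift once the deviation from the individual's own baseline accumulates; this is the kind of personalized, history-aware inference the proposed shifts are meant to make native to WFMs rather than bolted on.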