OSA: Echocardiography Video Segmentation via Orthogonalized State Update and Anatomical Prior-aware Feature Enhancement

arXiv cs.CV / 3/30/2026

Key Points

  • The paper proposes OSA, a video segmentation framework for extracting the left ventricle from echocardiography sequences where speckle noise and rapid non-rigid motion make spatiotemporal modeling difficult.
  • It introduces Orthogonalized State Update (OSU), which constrains recurrent state evolution on the Stiefel manifold to prevent rank collapse and preserve anatomically consistent temporal transitions.
  • To improve robustness to speckle noise, OSA adds an Anatomical Prior-aware Feature Enhancement module that separates structural anatomy from noise using a physics-driven process.
  • Experiments on CAMUS and EchoNet-Dynamic report state-of-the-art segmentation accuracy and better temporal stability, while retaining real-time inference efficiency suitable for clinical deployment.
  • The authors have released their code on GitHub, so other researchers and developers can reproduce and build on the method.
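To make the Orthogonalized State Update idea more concrete, here is a minimal sketch of a projected gradient step on the Stiefel manifold (the set of matrices with orthonormal columns). It is an illustration under stated assumptions, not the authors' implementation: the function names, the recall loss 0.5 * ||S k - v||^2, the step size, and the matrix sizes are all hypothetical choices made for this example. The projection uses the polar factor obtained from an SVD, which is one standard way to map a matrix back onto the manifold.

```python
import numpy as np


def project_to_stiefel(M):
    """Map M to a matrix with orthonormal columns (polar factor U @ Vt)."""
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ Vt


def projected_gradient_step(S, grad, lr=0.1):
    """Euclidean gradient step followed by projection onto the manifold."""
    return project_to_stiefel(S - lr * grad)


rng = np.random.default_rng(0)
d, r = 8, 4  # hypothetical state size: d x r with d >= r

# Start from a random point on the Stiefel manifold.
S = project_to_stiefel(rng.standard_normal((d, r)))

for _ in range(50):
    k = rng.standard_normal(r)  # hypothetical key vector
    v = rng.standard_normal(d)  # hypothetical value vector
    # Gradient of the assumed recall loss 0.5 * ||S k - v||^2 w.r.t. S.
    grad = np.outer(S @ k - v, k)
    S = projected_gradient_step(S, grad)

# After every step the singular values of S are all 1, so none can decay.
singular_values = np.linalg.svd(S, compute_uv=False)
```

Because each update ends with a projection, the state keeps full column rank by construction. That is the property the summary above describes as preventing rank collapse. How the paper computes or approximates this projection in practice is not stated here.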

Abstract

Accurate and temporally consistent segmentation of the left ventricle from echocardiography videos is essential for estimating the ejection fraction and assessing cardiac function. However, modeling spatiotemporal dynamics remains difficult due to severe speckle noise and rapid non-rigid deformations. Existing linear recurrent models offer efficient in-context associative recall for temporal tracking, but they rely on unconstrained state updates. These updates cause progressive singular value decay in the state matrix, a phenomenon known as rank collapse, which lets noise overwhelm anatomical details. To address this, we propose OSA, a framework that constrains the state evolution on the Stiefel manifold. We introduce the Orthogonalized State Update (OSU) mechanism, which formulates the memory evolution as Euclidean projected gradient descent on the Stiefel manifold to prevent rank collapse and maintain stable temporal transitions. Furthermore, an Anatomical Prior-aware Feature Enhancement module explicitly separates anatomical structures from speckle noise through a physics-driven process, providing the temporal tracker with noise-resilient structural cues. Comprehensive experiments on the CAMUS and EchoNet-Dynamic datasets show that OSA achieves state-of-the-art segmentation accuracy and temporal stability, while maintaining real-time inference efficiency for clinical deployment. Code is available at https://github.com/wangrui2025/OSA.
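The ejection fraction mentioned at the start of the abstract is the share of the end-diastolic volume (EDV) that the left ventricle ejects by end-systole (ESV), expressed as a percentage. The sketch below shows only that final arithmetic. The function name and the example volumes are hypothetical, and the step that turns segmentation masks into volumes is not covered here.

```python
def ejection_fraction(edv_ml, esv_ml):
    """Return the ejection fraction in percent from EDV and ESV in millilitres."""
    if edv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    if not 0 <= esv_ml <= edv_ml:
        raise ValueError("end-systolic volume must lie between 0 and EDV")
    return 100.0 * (edv_ml - esv_ml) / edv_ml


# Hypothetical volumes, chosen only to illustrate the calculation.
ef = ejection_fraction(edv_ml=120.0, esv_ml=50.0)
```

Since the result depends on the largest and smallest volumes across a cardiac cycle, frame-to-frame jitter in the segmentation can shift it directly, which is why temporal stability matters for this measurement.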