Low-Rank-Modulated Functa: Exploring the Latent Space of Implicit Neural Representations for Interpretable Ultrasound Video Analysis

arXiv cs.CV / 3/30/2026


Key Points

  • The study focuses on the fact that the structure and interpretability of the latent modulation vectors used in Functa-style implicit neural representations (INRs) remain largely unexplored, and investigates this latent space for ultrasound video.
  • The proposed method, Low-Rank-Modulated Functa (LRM-Functa), imposes a low-rank constraint on the modulation vectors in a time-resolved latent space, and shows that on cardiac ultrasound the resulting latent space exhibits well-structured periodic trajectories.
  • Traversing the latent space enables smooth frame generation along the cardiac cycle, and the paper reports that end-diastolic (ED) and end-systolic (ES) frames can be read out directly without additional training.
  • LRM-Functa outperforms existing methods on unsupervised ED/ES detection, and compressing each frame down to rank k=2 reportedly does not substantially degrade downstream performance such as ejection fraction prediction.
  • Beyond cardiac imaging, generalizability is evaluated on out-of-distribution (OOD) frame selection in a cardiac point-of-care dataset and on B-line classification in lung ultrasound.

Abstract

Implicit neural representations (INRs) have emerged as a powerful framework for continuous image representation learning. In Functa-based approaches, each image is encoded as a latent modulation vector that conditions a shared INR, enabling strong reconstruction performance. However, the structure and interpretability of the corresponding latent spaces remain largely unexplored. In this work, we investigate the latent space of Functa-based models for ultrasound videos and propose Low-Rank-Modulated Functa (LRM-Functa), a novel architecture that enforces a low-rank adaptation of modulation vectors in the time-resolved latent space. When applied to cardiac ultrasound, the resulting latent space exhibits clearly structured periodic trajectories, facilitating visualization and interpretability of temporal patterns. The latent space can be traversed to sample novel frames, revealing smooth transitions along the cardiac cycle, and enabling direct readout of end-diastolic (ED) and end-systolic (ES) frames without additional model training. We show that LRM-Functa outperforms prior methods in unsupervised ED and ES frame detection, while compressing each video frame to as low as rank k=2 without sacrificing competitive downstream performance on ejection fraction prediction. Evaluations on out-of-distribution frame selection in a cardiac point-of-care dataset, as well as on lung ultrasound for B-line classification, demonstrate the generalizability of our approach. Overall, LRM-Functa provides a compact, interpretable, and generalizable framework for ultrasound video analysis. The code is available at https://github.com/JuliaWolleb/LRM_Functa.
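The core idea of constraining per-frame modulation vectors to a low-rank subspace can be illustrated with a small NumPy sketch. This is an illustrative assumption of what a rank-k modulation factorization looks like, not the authors' implementation (see their repository for that): a clip's modulation matrix Phi of shape (T, d) is approximated as per-frame codes C (T, k) times a shared basis B (k, d), and a periodic trajectory in the codes stands in for the cardiac cycle. All variable names and shapes here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d, k = 32, 256, 2  # frames, modulation dimension, rank budget (paper: k=2)

# Simulate a periodic (cardiac-cycle-like) latent trajectory that lives in a
# k-dimensional subspace of the modulation space.
phase = 2 * np.pi * np.arange(T) / T
coeffs = np.stack([np.cos(phase), np.sin(phase)], axis=1)  # (T, k) per-frame codes
basis = rng.standard_normal((k, d))                        # shared directions
Phi = coeffs @ basis                                       # (T, d), rank <= k

# A truncated SVD recovers a rank-k factorization Phi ~= C @ B, i.e. each
# frame is summarized by just k numbers plus a basis shared across frames.
U, s, Vt = np.linalg.svd(Phi, full_matrices=False)
C = U[:, :k] * s[:k]   # (T, k) compressed per-frame codes
B = Vt[:k]             # (k, d) shared modulation basis

err = np.linalg.norm(Phi - C @ B) / np.linalg.norm(Phi)
print(f"rank-{k} relative reconstruction error: {err:.2e}")
```

In this toy setup the reconstruction is exact because Phi was built to have rank k; in the actual model the low-rank structure is enforced during training rather than recovered post hoc, and the periodic shape of the codes C is what makes ED/ES readout and latent traversal possible.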