CataractSAM-2: A Domain-Adapted Model for Anterior Segment Surgery Segmentation and Scalable Ground-Truth Annotation

arXiv cs.RO / 2026/3/24

Key Points

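CataractSAM-2 is a domain adaptation of Meta's SAM 2 for ophthalmic surgery, aiming at real-time, high-accuracy semantic segmentation of cataract surgery videos.
It is intended to strengthen the intraoperative perception needed for surgical robotics and computer-assisted surgery, and it sits at the intersection of computer vision and medical robotics.
To reduce the manual labeling burden, the authors propose an interactive annotation framework that combines sparse prompts with video mask propagation, making high-quality ground-truth generation easier to scale.
Zero-shot generalization to glaucoma surgery (trabeculectomy) is also demonstrated, suggesting the model may be useful across surgical procedures.
The trained model and annotation toolkit are released as open source to support the expansion of anterior segment surgery datasets and the development of deployable medical AI.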

Abstract

We present CataractSAM-2, a domain-adapted extension of Meta's Segment Anything Model 2, designed for real-time semantic segmentation of cataract ophthalmic surgery videos with high accuracy. Positioned at the intersection of computer vision and medical robotics, CataractSAM-2 enables precise intraoperative perception crucial for robotic-assisted and computer-guided surgical systems. Furthermore, to alleviate the burden of manual labeling, we introduce an interactive annotation framework that combines sparse prompts with video-based mask propagation. This tool significantly reduces annotation time and facilitates the scalable creation of high-quality ground-truth masks, accelerating dataset development for ocular anterior segment surgeries. We also demonstrate the model's strong zero-shot generalization to glaucoma trabeculectomy procedures, confirming its cross-procedural utility and potential for broader surgical applications. The trained model and annotation toolkit are released as open-source resources, establishing CataractSAM-2 as a foundation for expanding anterior ophthalmic surgical datasets and advancing real-time AI-driven solutions in medical robotics, as well as surgical video understanding.
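
The annotation workflow in the abstract (a sparse prompt on one frame, then mask propagation through the video) can be illustrated with a toy example. The sketch below is not CataractSAM-2 or the SAM 2 API: it swaps the learned model for a simple intensity-based flood fill, and every name in it (`segment_from_click`, `centroid`, `propagate`, `make_frame`, the `tol` threshold) is a hypothetical chosen for illustration. It only shows the control flow, assuming the object moves little between frames: the annotator clicks once, and each later frame is seeded from the centroid of the previous mask.

```python
from collections import deque


def segment_from_click(frame, seed, tol=10):
    """Toy prompted segmentation: flood-fill from a clicked pixel.

    Stand-in for a learned model. Grows a 4-connected region of pixels
    whose intensity is within `tol` of the clicked pixel's intensity.
    """
    height, width = len(frame), len(frame[0])
    seed_row, seed_col = seed
    reference = frame[seed_row][seed_col]
    mask = [[False] * width for _ in range(height)]
    mask[seed_row][seed_col] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < height and 0 <= nc < width
            if inside and not mask[nr][nc] and abs(frame[nr][nc] - reference) <= tol:
                mask[nr][nc] = True
                queue.append((nr, nc))
    return mask


def centroid(mask):
    """Return the rounded (row, col) centroid of a mask, or None if empty."""
    points = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not points:
        return None
    mean_row = sum(r for r, _ in points) / len(points)
    mean_col = sum(c for _, c in points) / len(points)
    return (round(mean_row), round(mean_col))


def propagate(frames, first_click, tol=10):
    """Segment every frame from a single click on the first frame.

    Each subsequent frame is prompted with the previous mask's centroid,
    which only works when the object shifts slightly between frames.
    """
    masks = []
    seed = first_click
    for frame in frames:
        mask = segment_from_click(frame, seed, tol)
        masks.append(mask)
        seed = centroid(mask) or seed
    return masks


def make_frame(left, height=6, width=8):
    """Synthetic frame: a bright 3x3 square on a dark background."""
    frame = [[0] * width for _ in range(height)]
    for r in range(1, 4):
        for c in range(left, left + 3):
            frame[r][c] = 200
    return frame


# One click on frame 0; the square drifts one column right per frame.
frames = [make_frame(1), make_frame(2), make_frame(3)]
masks = propagate(frames, first_click=(2, 2))
areas = [sum(v for row in m for v in row) for m in masks]
print(areas)
```

In a real pipeline the flood fill would be replaced by the model's prompted prediction and the centroid hand-off by the model's own memory-based propagation; the annotator would add corrective prompts only on frames where the propagated mask drifts.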
