Unleashing Video Language Models for Fine-grained HRCT Report Generation
arXiv cs.CV / 3/16/2026
📰 News · Models & Research
Key Points
- AbSteering is an abnormality-centric framework that steers Video Language Models toward precise HRCT report generation, addressing the challenges of high-volume 3D imaging and diverse pathologies.
- It combines an abnormality-centric Chain-of-Thought scheme with a Direct Preference Optimization objective that uses clinically confusable abnormalities as hard negatives to improve fine-grained discrimination.
- The approach demonstrates that general-purpose VideoLMs can transfer effectively to medical imaging when guided by this paradigm, achieving strong performance in HRCT report generation.
- It outperforms state-of-the-art domain-specific CT foundation models in detection sensitivity while reducing hallucinations, enhancing reliability for clinical reporting.
- The authors release their data and model weights publicly, enabling broader validation and reproduction.
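
To make the second key point concrete, here is a minimal sketch of how a preference pair with a clinically confusable hard negative could feed a Direct Preference Optimization loss. The summary does not give the paper's exact objective, so this uses the standard DPO loss from the general literature. The `CONFUSABLE` table, the function names, and the example log-probabilities are hypothetical illustrations, not taken from the paper.

```python
import math

# Hypothetical lookup of clinically confusable abnormalities (illustration only;
# the paper's actual confusion pairs are not listed in this summary).
CONFUSABLE = {
    "ground-glass opacity": "consolidation",
    "honeycombing": "traction bronchiectasis",
}


def make_hard_negative(report, confusable=CONFUSABLE):
    """Build a rejected report by swapping one finding for a confusable look-alike.

    Returns None when the report mentions no finding from the table.
    """
    for finding, lookalike in confusable.items():
        if finding in report:
            return report.replace(finding, lookalike)
    return None


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for a single preference pair.

    Inputs are sequence log-probabilities of the chosen and rejected reports
    under the trainable policy and under a frozen reference model.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    z = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(z)) computed in a numerically stable way.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


if __name__ == "__main__":
    chosen = "bilateral ground-glass opacity in the lower lobes"
    rejected = make_hard_negative(chosen)
    print("chosen:  ", chosen)
    print("rejected:", rejected)
    # Made-up log-probabilities, only to show the call shape.
    print("loss:", dpo_loss(-12.0, -15.0, -13.0, -13.5))
```

The loss falls as the policy assigns relatively more probability to the correct report than to the look-alike report, measured against the reference model. That is the mechanism by which hard negatives would push the model toward finer discrimination between similar abnormalities.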