T-Gated Adapter: A Lightweight Temporal Adapter for Vision-Language Medical Segmentation
arXiv cs.CV / 4/10/2026
Key Points
- The paper introduces **T-Gated Adapter**, a lightweight temporal adapter designed to improve **vision-language medical segmentation** by incorporating **adjacent-slice context** rather than treating 2D slices independently.
- It injects temporal information at the **visual token level** using a temporal transformer over a fixed context window of adjacent slices, followed by a spatial refinement block and an **adaptive gating mechanism** that balances temporal against single-slice features.
- Training on **30 labeled FLARE22 volumes** improves abdominal organ segmentation, reaching a **mean Dice of 0.704** with a **+0.206 gain** over a baseline VLM without temporal context.
- In **zero-shot cross-dataset** testing on BTCV and AMOS22, the approach shows consistent gains (**+0.210** and **+0.230**, respectively) and reduces the average cross-domain performance drop from **38.0% to 24.9%**.
- Cross-modality evaluation on **AMOS22 MRI** without MRI supervision yields a **mean Dice of 0.366**, outperforming a fully supervised CT-only 3D baseline (DynUNet: **0.224**), which suggests that CLIP-style visual semantics generalize better across modalities.
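
The adaptive gating idea described above can be illustrated with a minimal sketch. This is not the paper's implementation: the summary does not specify the gate's form, so the code assumes a simple sigmoid gate computed per visual token from both feature vectors. The names `gated_fusion`, `w_single`, `w_temporal`, and `bias` are hypothetical, and the temporal transformer and spatial refinement block are treated as already having produced the `temporal` features.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(single, temporal, w_single, w_temporal, bias):
    """Blend single-slice and temporal features with one scalar gate per token.

    single, temporal: lists of tokens, each token a list of floats.
    w_single, w_temporal, bias: hypothetical gate parameters (assumed form,
    not taken from the paper). A gate near 1 favors the temporal features,
    a gate near 0 falls back to the single-slice features.
    """
    fused, gates = [], []
    for s_tok, t_tok in zip(single, temporal):
        # Gate logit is a linear function of both feature vectors.
        logit = bias
        logit += sum(w * s for w, s in zip(w_single, s_tok))
        logit += sum(w * t for w, t in zip(w_temporal, t_tok))
        g = sigmoid(logit)
        gates.append(g)
        # Convex combination of the two feature vectors.
        fused.append([g * t + (1.0 - g) * s for s, t in zip(s_tok, t_tok)])
    return fused, gates


if __name__ == "__main__":
    single = [[1.0, 3.0]]    # one token, two feature channels
    temporal = [[3.0, 5.0]]  # same token after temporal mixing
    fused, gates = gated_fusion(single, temporal, [0.0, 0.0], [0.0, 0.0], 0.0)
    print(gates, fused)
```

With all gate parameters set to zero, the gate is 0.5 and the output is the average of the two feature vectors. A strongly negative bias pushes the gate toward 0, so the output reverts to the single-slice features, which is the behavior one would want when neighboring slices are uninformative.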