MedSAD-CLIP: Supervised CLIP with Token-Patch Cross-Attention for Medical Anomaly Detection and Segmentation
arXiv cs.CV · March 19, 2026
Key Points
- MedSAD-CLIP introduces a supervised adaptation of CLIP for medical anomaly detection and segmentation using Token-Patch Cross-Attention to improve lesion localization while preserving CLIP's generalization.
- The approach uses lightweight image adapters and learnable prompt tokens to efficiently tailor the pretrained CLIP encoder to the medical domain with a limited amount of labeled abnormal data.
- A margin-based image-text contrastive loss is proposed to sharpen the separation between normal and abnormal representations at the global feature level.
- Experiments on four datasets (Brain, Retina, Lung, Breast) show superior pixel-level segmentation and image-level classification compared with state-of-the-art methods, with code to be released.
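The summary does not give the exact formulation of the margin-based image-text contrastive loss, but its stated goal, pushing an image embedding toward its matching ("normal" or "abnormal") text prompt and away from the mismatched one by at least a margin, can be sketched with a standard hinge-style objective. The function name, the two-prompt setup, and the margin value below are illustrative assumptions, not the paper's definition:

```python
import numpy as np

def margin_contrastive_loss(img_emb, txt_emb, labels, margin=0.2):
    """Hinge-style margin loss over cosine similarities (illustrative sketch).

    img_emb: (B, D) image embeddings from the adapted CLIP image encoder.
    txt_emb: (2, D) text embeddings for the ["normal", "abnormal"] prompts.
    labels:  (B,) 0 = normal, 1 = abnormal.
    """
    # L2-normalize so the dot product equals cosine similarity, as in CLIP.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    sims = img @ txt.T                                  # (B, 2)
    idx = np.arange(len(labels))
    pos = sims[idx, labels]                             # similarity to matching prompt
    neg = sims[idx, 1 - labels]                         # similarity to mismatched prompt
    # Penalize samples whose positive similarity does not beat the
    # negative similarity by at least `margin`.
    return np.maximum(0.0, margin - pos + neg).mean()
```

With well-separated embeddings the hinge is inactive and the loss is zero; a misaligned sample (e.g. a normal image closest to the "abnormal" prompt) contributes `margin + (neg - pos)`, which gradient descent reduces by pulling the image embedding toward its matching prompt.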