Improving Generalization of Deep Learning for Brain Metastases Segmentation Across Institutions

arXiv cs.AI / 4/2/2026

💬 OpinionIdeas & Deep AnalysisModels & Research

共有:

Key Points

The study addresses a key limitation of deep learning for brain metastases (BM) segmentation: models trained on a single institution can underperform elsewhere due to differences in scanners, imaging protocols, and demographics.
It proposes a domain-adaptation pipeline (VAE-MMD preprocessing) that aligns features across institutions by combining a variational autoencoder with maximum mean discrepancy (MMD) loss, and then running nnU-Net for segmentation with architectural enhancements like skip connections and self-attention.
Evaluated on 740 patients from four public datasets (Stanford, UCSF, UCLM, PKG), the approach substantially improves cross-site performance without requiring target-domain labels.
The results show stronger segmentation quality across volumetric, detection, and boundary metrics, including a reported ~11.1% increase in mean F1, ~7.93% increase in mean surface Dice, and ~65.5% reduction in mean HD95 versus baseline nnU-Net.
Feature alignment is supported by a major drop in domain classifier accuracy (0.91 to 0.50), suggesting the method successfully reduces cross-institution heterogeneity in learned representations.

Abstract

Background: Deep learning has demonstrated significant potential for automated brain metastases (BM) segmentation; however, models trained at a singular institution often exhibit suboptimal performance at various sites due to disparities in scanner hardware, imaging protocols, and patient demographics. The goal of this work is to create a domain adaptation framework that will allow for BM segmentation to be used across multiple institutions. Methods: We propose a VAE-MMD preprocessing pipeline that combines variational autoencoders (VAE) with maximum mean discrepancy (MMD) loss, incorporating skip connections and self-attention mechanisms alongside nnU-Net segmentation. The method was tested on 740 patients from four public databases: Stanford, UCSF, UCLM, and PKG, evaluated by domain classifier's accuracy, sensitivity, precision, F1/F2 scores, surface Dice (sDice), and 95th percentile Hausdorff distance (HD95). Results: VAE-MMD reduced domain classifier accuracy from 0.91 to 0.50, indicating successful feature alignment across institutions. Reconstructed volumes attained a PSNR greater than 36 dB, maintaining anatomical accuracy. The combined method raised the mean F1 by 11.1% (0.700 to 0.778), the mean sDice by 7.93% (0.7121 to 0.7686), and reduced the mean HD95 by 65.5% (11.33 to 3.91 mm) across all four centers compared to the baseline nnU-Net. Conclusions: VAE-MMD effectively diminishes cross-institutional data heterogeneity and enhances BM segmentation generalization across volumetric, detection, and boundary-level metrics without necessitating target-domain labels, thereby overcoming a significant obstacle to the clinical implementation of AI-assisted segmentation.

Benchmarking Batch Deep Reinforcement Learning Algorithms

Dev.to

Qwen3.6-Plus: Alibaba's Quiet Giant in the AI Race Delivers a Million-Token Enterprise Powerhouse

Dev.to

How To Leverage AI for Back-Office Headcount Optimization

Dev.to

Is 1-bit and TurboQuant the future of OSS? A simulation for Qwen3.5 models.

Reddit r/LocalLLaMA

SOTA Language Models Under 14B?

Reddit r/LocalLLaMA

Improving Generalization of Deep Learning for Brain Metastases Segmentation Across Institutions

Key Points

Abstract

Related Articles

Benchmarking Batch Deep Reinforcement Learning Algorithms

Qwen3.6-Plus: Alibaba's Quiet Giant in the AI Race Delivers a Million-Token Enterprise Powerhouse

How To Leverage AI for Back-Office Headcount Optimization

Is 1-bit and TurboQuant the future of OSS? A simulation for Qwen3.5 models.

SOTA Language Models Under 14B?

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer