MAESIL: Masked Autoencoder for Enhanced Self-supervised Medical Image Learning

arXiv cs.CV / 4/2/2026


Key Points

  • The paper introduces MAESIL, a new self-supervised learning framework for 3D medical imaging (especially CT) that targets the lack of labeled data.
  • It argues that common SSL approaches degrade 3D structural learning by treating CT volumes as independent 2D slices, discarding axial coherence and spatial context.
  • MAESIL’s key contribution is the “superpatch,” a 3D chunk-based input unit that aims to preserve 3D context while keeping computation manageable.
  • The method uses a 3D masked autoencoder with a dual-masking strategy to learn richer spatial representations from unlabeled scans.
  • Experiments on three large public CT datasets show MAESIL improves reconstruction quality (e.g., PSNR and SSIM) over baselines like AE, VAE, and VQ-VAE, positioning it as a practical pre-training option for downstream 3D tasks.

Abstract

Training deep learning models for three-dimensional (3D) medical imaging, such as Computed Tomography (CT), is fundamentally challenged by the scarcity of labeled data. While pre-training on natural images is common, it results in a significant domain shift, limiting performance. Self-Supervised Learning (SSL) on unlabeled medical data has emerged as a powerful solution, but prominent frameworks often fail to exploit the inherent 3D nature of CT scans. These methods typically process 3D scans as a collection of independent 2D slices, an approach that fundamentally discards critical axial coherence and 3D structural context. To address this limitation, we propose the Masked Autoencoder for Enhanced Self-supervised medical Image Learning (MAESIL), a novel self-supervised learning framework designed to capture 3D structural information efficiently. The core innovation is the 'superpatch', a 3D chunk-based input unit that balances 3D context preservation with computational efficiency. Our framework partitions the volume into superpatches and employs a 3D masked autoencoder with a dual-masking strategy to learn comprehensive spatial representations. We validated our approach on three diverse, large-scale public CT datasets. Our experimental results show that MAESIL demonstrates significant improvements over existing methods such as AE, VAE, and VQ-VAE in key reconstruction metrics such as PSNR and SSIM. This establishes MAESIL as a robust and practical pre-training solution for 3D medical imaging tasks.
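To make the superpatch idea concrete, the sketch below partitions a CT volume into non-overlapping 3D chunks and draws two random Boolean masks over them. This is a hedged illustration only: the paper does not publish code here, so the function names (`to_superpatches`, `dual_mask`), the superpatch size, and the masking ratios are all assumptions, not MAESIL's actual implementation.

```python
import numpy as np

def to_superpatches(volume, size):
    """Partition a 3D volume into non-overlapping cubic 'superpatches'.

    Illustrative sketch: the superpatch size is an assumption, not a
    value reported in the paper.
    """
    d, h, w = volume.shape
    s = size
    assert d % s == 0 and h % s == 0 and w % s == 0, "volume must tile evenly"
    # Reshape into a grid of blocks, then flatten the grid dimensions so
    # each row is one (s, s, s) superpatch.
    patches = (volume
               .reshape(d // s, s, h // s, s, w // s, s)
               .transpose(0, 2, 4, 1, 3, 5)
               .reshape(-1, s, s, s))
    return patches  # shape: (num_superpatches, s, s, s)

def dual_mask(num_patches, coarse_ratio=0.6, fine_ratio=0.15, rng=None):
    """Draw two independent random masks over the superpatch sequence
    (True = masked). The two-mask structure mirrors the 'dual-masking'
    idea at a high level; the ratios here are placeholder assumptions.
    """
    rng = rng or np.random.default_rng(0)
    coarse = rng.random(num_patches) < coarse_ratio
    fine = rng.random(num_patches) < fine_ratio
    return coarse, fine

# Toy example: a 4x4x4 volume split into eight 2x2x2 superpatches.
vol = np.arange(4 * 4 * 4, dtype=np.float32).reshape(4, 4, 4)
patches = to_superpatches(vol, size=2)
print(patches.shape)  # (8, 2, 2, 2)
coarse_mask, fine_mask = dual_mask(len(patches))
```

In a masked-autoencoder setup, the encoder would then see only the unmasked superpatches and the decoder would reconstruct the hidden ones; the reconstruction quality is what PSNR and SSIM measure in the paper's experiments.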