AURORA: Adaptive Unified Representation for Robust Ultrasound Analysis

arXiv cs.CV / 3/23/2026

Key Points

  • Proposes AURORA, a unified multi-task framework built on a transformer visual encoder from the Qwen3-VL family, to handle segmentation, detection, classification, and landmark regression across diverse ultrasound data.
  • It projects intermediate token features into spatial feature maps and fuses them with a lightweight multi-scale feature pyramid to enable both pixel-level predictions and global reasoning in a shared representation.
  • Each task uses a small task-specific prediction head with task-aware sampling and selective loss balancing to manage heterogeneous supervision and reduce task imbalance.
  • The method is designed to be simple to optimize and broadly adaptable. The authors report validation performance rising from 67% to 85% and an average score of 81.84% on the official test set across all FMC-UIA tasks.
  • The code is publicly available at https://github.com/saitejalekkala33/FMCUIA-ISBI.git
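
To make the token-to-feature-map step in the key points more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes the encoder emits H*W patch tokens in row-major order with no extra special tokens, and it uses 2x2 average pooling as a stand-in for one coarser pyramid level, since the summary does not describe how the pyramid is built. The names `tokens_to_feature_map` and `average_pool_2x` are hypothetical.

```python
def tokens_to_feature_map(tokens, height, width):
    """Rearrange H*W token vectors (each of length C) into a C x H x W nested list.

    Assumes tokens are stored in row-major order: index = row * width + col.
    """
    if len(tokens) != height * width:
        raise ValueError("token count must equal height * width")
    channels = len(tokens[0])
    return [
        [[tokens[r * width + c][ch] for c in range(width)] for r in range(height)]
        for ch in range(channels)
    ]


def average_pool_2x(feature_map):
    """Halve spatial resolution by averaging non-overlapping 2x2 windows per channel.

    Illustrative stand-in for producing a coarser pyramid level.
    """
    pooled = []
    for channel in feature_map:
        h, w = len(channel), len(channel[0])
        pooled.append([
            [
                (channel[r][c] + channel[r][c + 1]
                 + channel[r + 1][c] + channel[r + 1][c + 1]) / 4.0
                for c in range(0, w - 1, 2)
            ]
            for r in range(0, h - 1, 2)
        ])
    return pooled


# Toy input: 16 tokens with 2 channels each, laid out on a 4 x 4 grid.
tokens = [[float(i), 10.0 * i] for i in range(16)]
fine = tokens_to_feature_map(tokens, height=4, width=4)   # 2 x 4 x 4
coarse = average_pool_2x(fine)                            # 2 x 2 x 2
```

The fine map keeps per-patch detail for pixel-level heads, while the pooled map summarizes larger regions. A real pyramid would typically use learned convolutions rather than fixed pooling.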

Abstract

Ultrasound images vary widely across scanners, operators, and anatomical targets, which often causes models trained in one setting to generalize poorly to new hospitals and clinical conditions. The Foundation Model Challenge for Ultrasound Image Analysis (FMC-UIA) reflects this difficulty by requiring a single model to handle multiple tasks, including segmentation, detection, classification, and landmark regression across diverse organs and datasets. We propose a unified multi-task framework based on a transformer visual encoder from the Qwen3-VL family. Intermediate token features are projected into spatial feature maps and fused using a lightweight multi-scale feature pyramid, enabling both pixel-level predictions and global reasoning within a shared representation. Each task is handled by a small task-specific prediction head, while training uses task-aware sampling and selective loss balancing to manage heterogeneous supervision and reduce task imbalance. Our method is designed to be simple to optimize and adaptable across a wide range of ultrasound analysis tasks. Performance improved from 67% to 85% on the validation set, and the method achieved an average score of 81.84% on the official test set across all tasks. The code is publicly available at: https://github.com/saitejalekkala33/FMCUIA-ISBI.git
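
The abstract names task-aware sampling and selective loss balancing without giving formulas, so the sketch below shows one common way such ideas are realized. It is an assumption, not the paper's method. Sampling uses temperature-scaled dataset sizes, a generic multi-task technique swapped in for illustration, and loss balancing takes a weighted mean over only those tasks that have annotations in the current batch. All function names, dataset sizes, and weights are hypothetical.

```python
import random


def task_sampling_probs(task_sizes, temperature=0.5):
    """Sampling probability per task, proportional to size ** temperature.

    temperature=1.0 samples in proportion to dataset size; temperature=0.0
    samples all tasks uniformly. Values in between soften the imbalance.
    """
    scaled = {task: size ** temperature for task, size in task_sizes.items()}
    total = sum(scaled.values())
    return {task: value / total for task, value in scaled.items()}


def sample_tasks(task_sizes, num_steps, temperature=0.5, seed=0):
    """Draw one task name per training step using the probabilities above."""
    rng = random.Random(seed)
    probs = task_sampling_probs(task_sizes, temperature)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=num_steps)


def combine_losses(task_losses, task_weights):
    """Weighted mean of the losses that are present.

    A value of None marks a task with no annotation in the current batch,
    so it contributes neither to the numerator nor to the normalizer.
    """
    active = {t: v for t, v in task_losses.items() if v is not None}
    if not active:
        return 0.0
    weight_sum = sum(task_weights[t] for t in active)
    return sum(task_weights[t] * v for t, v in active.items()) / weight_sum


# Hypothetical dataset sizes and weights, for illustration only.
sizes = {"segmentation": 900, "detection": 300, "classification": 100}
probs = task_sampling_probs(sizes, temperature=0.5)
schedule = sample_tasks(sizes, num_steps=8)
loss = combine_losses(
    {"segmentation": 0.8, "detection": 0.4, "classification": None},
    {"segmentation": 2.0, "detection": 1.0, "classification": 1.0},
)
```

With these toy numbers, the smallest task is drawn more often than its raw share of the data would suggest, and the missing classification label leaves the combined loss unaffected instead of pulling it toward zero.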