UCell: rethinking generalizability and scaling of bio-medical vision models

arXiv cs.CV / 4/2/2026


Key Points

  • The paper argues that in biomedical vision tasks, scaling can be limited by scarce, costly training data, so focusing on smaller, more generalizable models may be more effective than building ever-larger foundation models.
  • It introduces UCell, a 10–30M parameter biomedical segmentation model that uses a recursive forward computation structure to improve parameter efficiency.
  • Experiments on multiple benchmarks show UCell matches the performance of models 10–20× larger for single-cell segmentation while maintaining similar out-of-domain generalizability.
  • The authors report that UCell can be trained from scratch using only microscopy imaging data, avoiding reliance on large-scale pretraining on natural images.
  • They further validate adaptability through extensive one-shot and few-shot fine-tuning experiments across many small datasets, and provide an implementation on GitHub.

Abstract

The modern deep learning field is a scale-centric one: larger models have been shown to consistently outperform smaller models of similar architecture. In many sub-domains of biomedical research, however, model scaling is bottlenecked by the amount of available training data and the high cost of generating and validating additional high-quality data. Despite this practical hurdle, the majority of ongoing research still focuses on building bigger foundation models, whereas the alternative of improving the capability of small models has been under-explored. Here we experiment with building models with 10–30M parameters, tiny by modern standards, to perform the single-cell segmentation task. An important design choice is the incorporation of a recursive structure into the model's forward computation graph, leading to a more parameter-efficient architecture. We found that for single-cell segmentation, on multiple benchmarks, our small model, UCell, matches the performance of models 10–20 times its size, with similar generalizability to unseen out-of-domain data. More importantly, we found that UCell can be trained from scratch using only a set of microscopy imaging data, without relying on massive pretraining on natural images, thereby decoupling model building from any external commercial interests. Finally, we examined and confirmed the adaptability of UCell by performing a wide range of one-shot and few-shot fine-tuning experiments on a diverse set of small datasets. Implementation is available at https://github.com/jiyuuchc/ucell
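
The paper does not spell out the recursive structure here, but the parameter-efficiency argument rests on weight tying: the same block is applied repeatedly in the forward pass, so effective depth grows while the parameter count stays fixed. The following is a minimal, hypothetical sketch of that idea (not the authors' implementation; a dense layer stands in for a conv block, and all names are illustrative):

```python
import math
import random

random.seed(0)

def make_block(dim):
    """One dense layer standing in for a conv block: dim*dim weights + dim biases."""
    return {
        "W": [[random.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(dim)],
        "b": [0.0] * dim,
    }

def apply_block(p, x):
    """y_j = tanh(sum_i x_i * W[i][j] + b_j)"""
    return [
        math.tanh(sum(x[i] * p["W"][i][j] for i in range(len(x))) + p["b"][j])
        for j in range(len(p["b"]))
    ]

def recursive_forward(p, x, steps):
    # The same weights are reused at every iteration (weight tying),
    # so depth increases without adding any parameters.
    for _ in range(steps):
        x = apply_block(p, x)
    return x

def n_params(p):
    return sum(len(row) for row in p["W"]) + len(p["b"])

dim, steps = 8, 4
shared = make_block(dim)                            # recursive model: one block
stacked = [make_block(dim) for _ in range(steps)]   # conventional: distinct blocks

y = recursive_forward(shared, [1.0] * dim, steps)
print(n_params(shared))                             # 8*8 + 8 = 72
print(sum(n_params(b) for b in stacked))            # 4 * 72 = 288
```

At equal effective depth, the tied model here holds 4x fewer parameters than the stacked one, which is the flavor of saving that lets a 10–30M-parameter model emulate the depth of much larger architectures.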