NucEval: A Robust Evaluation Framework for Nuclear Instance Segmentation

arXiv cs.CV / 5/6/2026


Key Points

The paper introduces NucEval, a unified evaluation framework aimed at improving how nuclear instance segmentation is assessed in computational pathology.
It identifies four often-underappreciated evaluation-pipeline problems—vague regions, score normalization, overlapping instances, and border uncertainty—and provides specific fixes for each.
NucEval is tested on the NuInsSeg dataset plus two external datasets, using three CNN- and ViT-based segmentation models to show how the proposed changes affect instance segmentation metrics.
The authors make the code, guidelines, and example usage publicly available to support robust and reproducible evaluation across studies.
Overall, the work argues that reported performance for nuclear instance segmentation depends substantially on the evaluation methodology, not only on the model being evaluated.

Abstract

In computational pathology, nuclear instance segmentation is a fundamental task with many downstream clinical applications. With the advent of deep learning, many approaches, including convolutional neural networks (CNNs) and vision transformers (ViTs), have been proposed for this task, along with both machine learning-based and non-machine learning-based pre- and post-processing techniques to further boost performance. However, one fundamental aspect that has received less attention is the evaluation pipeline. In this study, we identify four key issues associated with nuclear instance segmentation evaluation and propose corresponding solutions. Our proposed modifications, namely handling vague regions, score normalization, overlapping instances, and border uncertainty, are integrated into a unified framework called NucEval, which enables robust evaluation of nuclear instance segmentation. We evaluate this pipeline using the NuInsSeg dataset, which provides unique characteristics that make it particularly suitable for this study, as well as two additional external datasets, with three CNN- and ViT-based nuclear instance segmentation models, to demonstrate the impact of these modifications on instance segmentation metrics. The code, along with complete guidelines and illustrative examples, is publicly available at: https://github.com/masih4/nuc_eval.
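To make the first of the four issues concrete, here is a minimal sketch of how excluding vague regions can shift an instance-level score. It is not NucEval's implementation or API: the abstract does not describe how the framework handles vague regions, so the function name `detection_f1`, the `ignore` mask convention, and the choice of metric (F1 over instances matched at IoU above 0.5) are all assumptions made for illustration.

```python
import numpy as np


def detection_f1(gt, pred, ignore=None, iou_thr=0.5):
    """Instance-level F1 between two integer label maps (0 = background).

    Hypothetical helper, not part of NucEval. Pixels where `ignore` is True
    are set to background in both maps before matching (an assumed way of
    excluding vague regions).
    """
    gt, pred = gt.copy(), pred.copy()
    if ignore is not None:
        gt[ignore] = 0
        pred[ignore] = 0

    gt_ids = [i for i in np.unique(gt) if i != 0]
    pred_ids = [i for i in np.unique(pred) if i != 0]

    matched_pred = set()
    tp = 0
    for g in gt_ids:
        g_mask = gt == g
        # Only predictions that overlap this ground-truth nucleus can match it.
        for p in np.unique(pred[g_mask]):
            if p == 0 or p in matched_pred:
                continue
            p_mask = pred == p
            inter = np.logical_and(g_mask, p_mask).sum()
            union = np.logical_or(g_mask, p_mask).sum()
            # With a threshold of at least 0.5 and a strict comparison,
            # a match is unique, so greedy matching is sufficient.
            if inter / union > iou_thr:
                tp += 1
                matched_pred.add(p)
                break

    fp = len(pred_ids) - tp
    fn = len(gt_ids) - tp
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


# Toy example: two annotated nuclei, three predictions.
gt = np.zeros((6, 6), dtype=int)
gt[0:3, 0:3] = 1
gt[4:6, 4:6] = 2

pred = np.zeros((6, 6), dtype=int)
pred[0:3, 0:3] = 1
pred[0:2, 4:6] = 2  # prediction with no annotated counterpart
pred[4:6, 4:6] = 3

vague = np.zeros((6, 6), dtype=bool)
vague[0:2, 4:6] = True  # assumed vague region covering that prediction

print(detection_f1(gt, pred))                # 0.8
print(detection_f1(gt, pred, ignore=vague))  # 1.0
```

In this toy case the same prediction scores 0.8 when the unannotated object counts as a false positive and 1.0 when the vague region is excluded, which illustrates the abstract's point that evaluation choices alone can move the reported metric. For the framework's actual handling of all four issues, refer to the repository linked above.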
