A dataset of medication images with instance segmentation masks for preventing adverse drug events

arXiv cs.CV / 3/12/2026



Key Points

MEDISEG provides instance segmentation annotations for 32 pill types across 8262 images, addressing real-world complexities like overlapping pills and occlusions.
YOLOv8 and YOLOv9 models trained on the dataset serve as benchmarks, reaching a mean average precision at IoU 0.5 of 99.5% on the 3-Pills subset and 80.1% on the 32-Pills subset.
Few-shot detection experiments show base training on MEDISEG significantly improves recognition of unseen pill classes in occluded multi-pill scenarios compared to existing datasets.
The dataset is a valuable resource for developing and benchmarking AI-driven systems for medication safety and promoting transferable representations under limited supervision.

Abstract

Medication errors and adverse drug events (ADEs) pose significant risks to patient safety, often arising from difficulties in reliably identifying pharmaceuticals in real-world settings. AI-based pill recognition models offer a promising solution, but the lack of comprehensive datasets hinders their development. Existing pill image datasets rarely capture real-world complexities such as overlapping pills, varied lighting, and occlusions. MEDISEG addresses this gap by providing instance segmentation annotations for 32 distinct pill types across 8262 images, encompassing diverse conditions from individual pill images to cluttered dosette boxes. We trained YOLOv8 and YOLOv9 on MEDISEG to demonstrate its usability, achieving mean average precision at IoU 0.5 of 99.5 percent on the 3-Pills subset and 80.1 percent on the 32-Pills subset. We further evaluate MEDISEG under a few-shot detection protocol, demonstrating that base training on MEDISEG significantly improves recognition of unseen pill classes in occluded multi-pill scenarios compared to existing datasets. These results highlight the dataset's ability not only to support robust supervised training but also to promote transferable representations under limited supervision, making it a valuable resource for developing and benchmarking AI-driven systems for medication safety.
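
For readers unfamiliar with the "IoU 0.5" criterion behind the reported scores, the following is a minimal sketch of how a predicted instance mask can be compared with a ground-truth mask. It is not the authors' evaluation code: the function names, the toy masks, and the label `pill_a` are hypothetical, and masks are assumed to be same-sized grids of 0/1 values. Full mean average precision additionally ranks predictions by confidence and averages precision over recall levels and classes, which this sketch leaves out.

```python
def mask_iou(mask_a, mask_b):
    """Intersection over union of two same-sized binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            if pixel_a and pixel_b:
                intersection += 1
            if pixel_a or pixel_b:
                union += 1
    # Two empty masks share no pixels; treat their IoU as 0 to avoid dividing by zero.
    return intersection / union if union else 0.0


def is_true_positive(pred_mask, pred_label, gt_mask, gt_label, iou_threshold=0.5):
    """A prediction counts as correct when the class matches and IoU meets the threshold."""
    return pred_label == gt_label and mask_iou(pred_mask, gt_mask) >= iou_threshold


# Toy 2x4 masks (hypothetical data): the prediction covers the pill plus one extra column.
gt_mask = [[1, 1, 0, 0],
           [1, 1, 0, 0]]
pred_mask = [[1, 1, 1, 0],
             [1, 1, 1, 0]]
# A second prediction shifted one column to the right, as can happen with occluded pills.
shifted_mask = [[0, 1, 1, 1],
                [0, 1, 1, 1]]

print(mask_iou(pred_mask, gt_mask))                               # 4 shared pixels / 6 in the union
print(is_true_positive(pred_mask, "pill_a", gt_mask, "pill_a"))    # accepted at threshold 0.5
print(is_true_positive(shifted_mask, "pill_a", gt_mask, "pill_a")) # rejected: IoU is 2/8
```

Under this criterion a prediction that overlaps a pill only partially, as is common in cluttered dosette boxes, is counted as a miss even when the class label is right.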