Extracting Breast Cancer Phenotypes from Clinical Notes: Comparing LLMs with Classical Ontology Methods
arXiv cs.CL / 4/9/2026
Key Points
- The study presents an LLM-based framework for extracting structured breast cancer phenotypes (e.g., treatment outcomes, biomarkers, tumor location, size, and growth patterns) from unstructured oncology clinical notes in EMRs.
- It compares the LLM approach with earlier knowledge-driven methods that annotate notes using the NCIt Ontology Annotator.
- Results indicate that the LLM information-extraction framework can match the accuracy of the classical ontology-based methods while working directly on the free-text notes.
- The authors argue the trained framework is adaptable and can be fine-tuned to cover other cancer types and diseases beyond breast cancer.
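
To make the comparison in the key points concrete, here is a minimal sketch of how per-field accuracy might be scored for any extractor, LLM-based or ontology-based, against gold annotations. It is an illustration under stated assumptions, not the authors' evaluation code: the `BreastCancerPhenotype` schema, its field names, the exact-match criterion, and the toy records are all hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class BreastCancerPhenotype:
    """Hypothetical target schema; field names are illustrative, not the paper's."""
    treatment_outcome: Optional[str] = None
    biomarkers: Optional[str] = None
    tumor_location: Optional[str] = None
    tumor_size_mm: Optional[float] = None
    growth_pattern: Optional[str] = None


def field_accuracy(predicted, gold):
    """Return, per field, the fraction of notes where prediction equals gold exactly."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("need at least one annotated note")
    scores = {}
    for f in fields(BreastCancerPhenotype):
        hits = sum(
            getattr(p, f.name) == getattr(g, f.name)
            for p, g in zip(predicted, gold)
        )
        scores[f.name] = hits / len(gold)
    return scores


# Toy, made-up records purely to exercise the function (not data from the study).
gold = [
    BreastCancerPhenotype("complete response", "ER+/PR+/HER2-", "left breast", 22.0, "ductal"),
    BreastCancerPhenotype("partial response", "HER2+", "right breast", 15.0, "lobular"),
]
extractor_output = [
    BreastCancerPhenotype("complete response", "ER+/PR+/HER2-", "left breast", 22.0, "ductal"),
    BreastCancerPhenotype("partial response", "HER2+", "right breast", 18.0, "lobular"),
]

scores = field_accuracy(extractor_output, gold)
```

Running the same function on the output of each method over the same annotated notes gives directly comparable per-field numbers. Exact match is a deliberately strict choice; a real evaluation would likely normalize strings or allow a tolerance on numeric fields such as tumor size.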