From Documents to Spans: Code-Centric Learning for LLM-based ICD Coding
arXiv cs.CL / 3/17/2026
Key Points
- The paper proposes Code-Centric Learning for LLM-based ICD coding, shifting supervision from full clinical documents to short, scalable evidence spans to improve generalization to unseen ICD codes.
- It introduces a mixed training strategy and code-centric data expansion, which together reduce training cost while improving accuracy and interpretability.
- Training on spans still lets the LLM perform document-level ICD coding efficiently, which addresses the difficulty of processing long clinical documents.
- The method outperforms strong baselines under the same LLM backbone and allows small-scale LLMs to match the performance of larger proprietary models.
- The approach preserves interpretability by attaching explicit evidence for assigned codes.
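
To make the span-to-document idea above concrete, here is a minimal sketch of how span-level predictions could be aggregated into document-level codes with attached evidence. It is an illustration under stated assumptions, not the paper's implementation: the function names (`split_into_spans`, `code_span`, `code_document`) are hypothetical, the sentence splitter is a naive regex, and `code_span` is a toy keyword lookup standing in for a span-level LLM call.

```python
import re

# Toy phrase-to-code table used ONLY as a stand-in for a span-level LLM.
# The mappings are illustrative assumptions, not taken from the paper.
TOY_RULES = {
    "type 2 diabetes": "E11.9",
    "hypertension": "I10",
}


def split_into_spans(document: str) -> list[str]:
    """Naively split a clinical note into sentence-like spans."""
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [p for p in parts if p]


def code_span(span: str) -> list[str]:
    """Hypothetical span-level coder; a real system would query an LLM here."""
    lowered = span.lower()
    return [code for phrase, code in TOY_RULES.items() if phrase in lowered]


def code_document(document: str) -> dict[str, list[str]]:
    """Aggregate span-level codes into document-level codes with evidence."""
    evidence_by_code: dict[str, list[str]] = {}
    for span in split_into_spans(document):
        for code in code_span(span):
            # Keep every supporting span so each assigned code stays explainable.
            evidence_by_code.setdefault(code, []).append(span)
    return evidence_by_code


if __name__ == "__main__":
    note = (
        "Patient has a history of hypertension. "
        "Presents with poorly controlled type 2 diabetes. "
        "No chest pain."
    )
    for code, evidence in code_document(note).items():
        print(code, "<-", evidence)
```

Because each code is stored alongside the spans that triggered it, the document-level output carries its own evidence, which mirrors the interpretability property described in the key points.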