Analysing Lightweight Large Language Models for Biomedical Named Entity Recognition on Diverse Output Formats

arXiv cs.AI · April 30, 2026

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper analyzes how lightweight large language models (LLMs) perform on biomedical named entity recognition while reducing the computational and fine-tuning burden typical of larger models in healthcare.
  • It evaluates how different output formats affect model performance and finds that lightweight LLMs can reach competitive results versus larger counterparts.
  • The study reports that instruction tuning across many distinct formats does not improve performance, suggesting diminishing returns from broad-format instruction tuning.
  • It also identifies specific output formats that are consistently associated with better performance for biomedical information extraction tasks.
  • Overall, the findings support the use of lightweight, format-aware LLM approaches to meet privacy and budget constraints in medical settings.

Abstract

Despite their strong linguistic capabilities, Large Language Models (LLMs) are computationally demanding and require substantial resources for fine-tuning, which is ill-suited to the privacy and budget constraints of many healthcare settings. To address this, we present an experimental analysis of Biomedical Named Entity Recognition using lightweight LLMs, evaluating the impact of different output formats on model performance. The results reveal that lightweight LLMs can achieve competitive performance compared to larger models, highlighting their potential as lightweight yet effective alternatives for biomedical information extraction. Our analysis shows that instruction tuning over many distinct formats does not improve performance, but it identifies several formats that are consistently associated with better performance.
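To make the "output format" variable concrete, the sketch below renders one biomedical NER annotation in three candidate formats a model might be asked to produce: a structured JSON list, inline XML-style tags, and flat entity/type pairs. The specific formats, entity types, and example sentence are illustrative assumptions, not the exact formats evaluated in the paper.

```python
# Sketch: the same biomedical NER annotation serialized into three
# candidate output formats. Formats here are illustrative assumptions,
# not the exact ones studied in the paper.
import json

sentence = "Aspirin reduces the risk of myocardial infarction."
entities = [
    {"text": "Aspirin", "type": "Chemical", "start": 0, "end": 7},
    {"text": "myocardial infarction", "type": "Disease", "start": 28, "end": 49},
]

def to_json(ents):
    """Structured JSON list: easy to parse and validate programmatically."""
    return json.dumps(ents)

def to_inline_tags(text, ents):
    """Inline XML-style tags embedded directly in the source sentence."""
    out, cursor = [], 0
    for e in sorted(ents, key=lambda e: e["start"]):
        out.append(text[cursor:e["start"]])
        out.append(f'<{e["type"]}>{text[e["start"]:e["end"]]}</{e["type"]}>')
        cursor = e["end"]
    out.append(text[cursor:])
    return "".join(out)

def to_pairs(ents):
    """Flat "entity | type" lines: short and cheap for a model to generate."""
    return "\n".join(f'{e["text"]} | {e["type"]}' for e in ents)
```

For example, `to_inline_tags(sentence, entities)` yields `<Chemical>Aspirin</Chemical> reduces the risk of <Disease>myocardial infarction</Disease>.` The paper's finding suggests that choosing one well-performing format matters more than instruction-tuning across many of them.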