Improving Clinical Trial Recruitment using Clinical Narratives and Large Language Models
arXiv cs.CL / 4/8/2026
Key Points
- The study evaluates encoder-based and generative (decoder-based) large language models for screening clinical narratives, aiming to reduce the labor-intensive bottlenecks in clinical trial recruitment.
- It compares general-purpose LLMs with medically adapted LLMs and tests three long-document handling strategies: default long-context processing, NER-based extractive summarization, and RAG with dynamic retrieval driven by the eligibility criteria.
- Using the 2018 N2C2 Track 1 benchmark dataset, the MedGemma model combined with the RAG strategy achieved the highest micro-F1 score of 89.05%.
- Results suggest generative LLMs provide the largest gains for eligibility criteria that require long-range reasoning across extended documents, while improvements for short-context criteria (e.g., a single lab test) are more incremental.
- The paper concludes that real-world deployments should select among rule-based queries, encoder LLMs, and generative LLM approaches according to the needs of each eligibility criterion, while keeping compute costs reasonable.
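The RAG strategy described above retrieves only the narrative passages relevant to a given eligibility criterion before prompting the model. A minimal sketch of that retrieval step, using a toy bag-of-words cosine similarity as a stand-in for a real embedding model (the passages, criterion text, and function names here are hypothetical, not from the paper):

```python
import math
import re
from collections import Counter

def tokens(text):
    # Lowercase alphanumeric tokenization (a crude stand-in for a real tokenizer).
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    # Cosine similarity between two term-frequency Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(criterion, passages, k=2):
    # Rank narrative passages by similarity to the eligibility criterion
    # and keep only the top-k to place in the LLM prompt.
    q = Counter(tokens(criterion))
    ranked = sorted(passages, key=lambda p: cosine(q, Counter(tokens(p))), reverse=True)
    return ranked[:k]

# Toy clinical narrative split into passages (hypothetical data).
passages = [
    "Patient reports chest pain relieved by rest.",
    "HbA1c measured at 8.2 percent, consistent with poorly controlled diabetes.",
    "Creatinine 1.1 mg/dL, renal function within normal limits.",
]
criterion = "HbA1c above 6.5 percent indicating diabetes"
top = retrieve(criterion, passages, k=1)
```

In a deployed system, the bag-of-words scorer would be replaced by a dense retriever, but the control flow is the same: the criterion acts as the query, so each criterion dynamically selects a different slice of the long document.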