Neuro-Oracle: A Trajectory-Aware Agentic RAG Framework for Interpretable Epilepsy Surgical Prognosis

arXiv cs.LG / 4/17/2026

Key Points

  • The paper introduces Neuro-Oracle, a three-stage agentic RAG framework to improve prediction of post-surgical seizure outcomes in pharmacoresistant epilepsy by leveraging longitudinal MRI changes rather than single pre-operative scans.
  • Neuro-Oracle first distills pre-to-post-operative MRI differences into a compact 512-dimensional trajectory vector using a 3D Siamese contrastive encoder, then retrieves similar historical surgical trajectories from a population archive via nearest-neighbor search.
  • The framework then generates a natural-language prognosis using a quantized Llama-3-8B reasoning agent grounded in the retrieved evidence, aiming for interpretability rather than purely predictive scoring.
  • Experiments on the EPISURG dataset (N=268 longitudinally paired cases) with five-fold stratified cross-validation show trajectory-based methods achieve AUCs of 0.834–0.905, outperforming a single-timepoint ResNet-50 baseline (AUC 0.793).
  • The language-enabled Neuro-Oracle agent (M5) reaches an AUC of 0.867, matching purely discriminative trajectory classifiers, while producing structured justifications with zero observed hallucinations under the authors’ audit protocol. The authors caution that the labels are a clinical proxy based on resection type, so the encoder may be learning resection anatomy (temporal versus non-temporal location) rather than prognostic morphometry, and they present the evaluation mainly as a proof of concept.
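
To make the retrieval and grounding stages more concrete, here is a minimal sketch of nearest-neighbour search over 512-dimensional trajectory vectors, followed by assembly of an evidence block for a language model. It is not the authors' implementation. The summary does not state the distance metric, so cosine similarity is an assumption, and the archive layout, the field names (`case_id`, `vector`, `label`), the prompt wording, and the random stand-in data are all hypothetical.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_neighbours(query, archive, k=5):
    """Return the k archive entries most similar to the query, best first.

    Brute-force search; cosine similarity is an assumed metric.
    """
    scored = [(cosine_similarity(query, entry["vector"]), entry) for entry in archive]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(neighbours):
    """Format retrieved cases as an evidence block (hypothetical prompt wording)."""
    lines = ["Retrieved historical surgical trajectories:"]
    for rank, (score, entry) in enumerate(neighbours, start=1):
        lines.append(
            f"{rank}. case={entry['case_id']} "
            f"similarity={score:.3f} outcome_label={entry['label']}"
        )
    lines.append(
        "Using only the cases above, give a prognosis and cite the case IDs you relied on."
    )
    return "\n".join(lines)


# Synthetic stand-in data: random vectors, NOT real trajectory embeddings.
rng = random.Random(0)
DIM = 512
archive = [
    {
        "case_id": f"case_{i:03d}",
        "vector": [rng.gauss(0.0, 1.0) for _ in range(DIM)],
        "label": i % 2,  # placeholder proxy label
    }
    for i in range(50)
]

# The query is a lightly perturbed copy of case_007, so that case should rank first.
query = [v + rng.gauss(0.0, 0.01) for v in archive[7]["vector"]]

neighbours = retrieve_neighbours(query, archive, k=3)
prompt = build_prompt(neighbours)
print(prompt)
```

Restricting the model to the listed cases and asking it to cite case IDs is one plausible way to make its justifications auditable against the retrieved evidence. Whether the paper's agent is prompted this way is not stated in the summary.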

Abstract

Predicting post-surgical seizure outcomes in pharmacoresistant epilepsy is a clinical challenge. Conventional deep-learning approaches operate on static, single-timepoint pre-operative scans, omitting longitudinal morphological changes. We propose Neuro-Oracle, a three-stage framework that: (i) distils pre-to-post-operative MRI changes into a compact 512-dimensional trajectory vector using a 3D Siamese contrastive encoder; (ii) retrieves historically similar surgical trajectories from a population archive via nearest-neighbour search; and (iii) synthesises a natural-language prognosis grounded in the retrieved evidence using a quantized Llama-3-8B reasoning agent. Evaluations are conducted on the public EPISURG dataset (N=268 longitudinally paired cases) using five-fold stratified cross-validation. Since ground-truth seizure-freedom scores are unavailable, we utilize a clinical proxy label based on the resection type. We acknowledge that the network representations may learn the anatomical features of the resection cavities (i.e., temporal versus non-temporal locations) rather than true prognostic morphometry. Our current evaluation thus serves mainly as a proof-of-concept for the trajectory-aware retrieval architecture. Trajectory-based classifiers achieve AUC values between 0.834 and 0.905, compared with 0.793 for a single-timepoint ResNet-50 baseline. The Neuro-Oracle agent (M5) matches the AUC of purely discriminative trajectory classifiers (0.867) while producing structured justifications with zero observed hallucinations under our audit protocol. A Siamese Diversity Ensemble (M6) of trajectory-space classifiers attains an AUC of 0.905 without language-model overhead.
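
As a rough illustration of the evaluation protocol named in the abstract, the sketch below splits 268 cases into five stratified folds and computes AUC on each held-out fold. The class balance (100 positive, 168 negative) and the scores are synthetic placeholders, not values from the paper, and no model is trained here. The helper names `stratified_folds` and `roc_auc` are hypothetical.

```python
import random


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign sample indices to folds so each fold keeps roughly the same class mix."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal this class's samples round-robin across the folds.
        for pos, i in enumerate(idx):
            folds[pos % n_folds].append(i)
    return folds


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Synthetic placeholders: the class balance and scores are NOT from the paper.
rng = random.Random(1)
labels = [1] * 100 + [0] * 168
scores = [0.5 * y + rng.gauss(0.0, 0.4) for y in labels]

folds = stratified_folds(labels, n_folds=5, seed=0)
fold_aucs = []
for held_out in folds:
    y_true = [labels[i] for i in held_out]
    y_score = [scores[i] for i in held_out]
    fold_aucs.append(roc_auc(y_true, y_score))

mean_auc = sum(fold_aucs) / len(fold_aucs)
print("per-fold AUC:", [round(a, 3) for a in fold_aucs])
print("mean AUC:", round(mean_auc, 3))
```

In a real run the scores for each held-out fold would come from a model fitted on the other four folds. Stratification matters on a cohort of this size because an unlucky split could leave a fold with too few cases of one class for a stable AUC estimate.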