MERIT: Memory-Enhanced Retrieval for Interpretable Knowledge Tracing

arXiv cs.AI / 3/25/2026


Key Points

  • The paper introduces MERIT, a training-free Knowledge Tracing framework that aims to improve interpretability while maintaining strong predictive accuracy for student performance modeling.
  • Instead of fine-tuning an LLM, MERIT uses a frozen LLM for reasoning and builds an interpretable “memory bank” from raw interaction logs, consisting of latent cognitive schemas and a paradigm bank of representative error patterns.
  • It applies semantic denoising to cluster students by cognitive schemas and analyzes error patterns offline to produce explicit Chain-of-Thought rationales for better transparency.
  • During inference, MERIT uses hierarchical routing to retrieve relevant contextual information and a logic-augmented module with semantic constraints to calibrate predictions.
  • The authors report state-of-the-art results on real-world datasets while reducing computational cost and enabling dynamic knowledge updates without gradient updates.
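
The paper does not include code, but the offline "semantic denoising" step described above (grouping students into latent cognitive schemas) can be sketched with ordinary clustering. In this illustrative version, each student is a per-skill error-rate vector and plain k-means stands in for whatever embedding-based clustering MERIT actually uses; all names here are assumptions, not the authors' implementation.

```python
# Hypothetical sketch: cluster students' error-rate profiles into
# "cognitive schemas". MERIT presumably clusters LLM-derived semantic
# representations; simple k-means over numeric vectors is a stand-in.
import random

def kmeans(vectors, k=2, iters=25, seed=0):
    """Cluster a list of equal-length tuples into k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            # assign each student to the nearest centroid (squared Euclidean)
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(v, centroids[c])))
            groups[j].append(v)
        for j, g in enumerate(groups):
            if g:  # recompute centroid as the mean of its members
                centroids[j] = tuple(sum(col) / len(g) for col in zip(*g))
    return centroids

def assign_schema(student_vec, centroids):
    """Route one student's error profile to its nearest schema index."""
    return min(range(len(centroids)),
               key=lambda c: sum((a - b) ** 2
                                 for a, b in zip(student_vec, centroids[c])))
```

With well-separated profiles (e.g. consistently low vs. consistently high error rates across skills), the two groups land in different schemas, which is the property the framework's downstream retrieval relies on.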

Abstract

Knowledge Tracing (KT) models students' evolving knowledge states to predict future performance, serving as a foundation for personalized education. While traditional deep learning models achieve high accuracy, they often lack interpretability. Large Language Models (LLMs) offer strong reasoning capabilities but struggle with limited context windows and hallucinations. Furthermore, existing LLM-based methods typically require expensive fine-tuning, limiting scalability and adaptability to new data. We propose MERIT (Memory-Enhanced Retrieval for Interpretable Knowledge Tracing), a training-free framework combining frozen LLM reasoning with structured pedagogical memory. Rather than updating parameters, MERIT transforms raw interaction logs into an interpretable memory bank. The framework uses semantic denoising to categorize students into latent cognitive schemas and constructs a paradigm bank where representative error patterns are analyzed offline to generate explicit Chain-of-Thought (CoT) rationales. During inference, a hierarchical routing mechanism retrieves relevant contexts, while a logic-augmented module applies semantic constraints to calibrate predictions. By grounding the LLM in interpretable memory, MERIT achieves state-of-the-art performance on real-world datasets without gradient updates. This approach reduces computational costs and supports dynamic knowledge updates, improving the accessibility and transparency of educational diagnosis.
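
The inference stage the abstract describes (hierarchical routing into the memory bank, then a logic-augmented calibration of the frozen LLM's prediction) can be sketched as a two-level lookup followed by a constrained blend. The memory-bank layout, field names, and the calibration rule below are illustrative assumptions for exposition, not the authors' code.

```python
# Hypothetical sketch of MERIT-style inference: coarse routing by cognitive
# schema, fine routing by skill tag, then rule-based calibration standing in
# for the paper's logic-augmented module with semantic constraints.

MEMORY_BANK = {
    # schema id -> skill tag -> representative error paradigm + CoT rationale
    "careless": {
        "fractions": {
            "pattern": "sign slip",
            "rationale": "Student knows the procedure but flips a sign.",
            "failure_rate": 0.3,
        },
    },
    "conceptual": {
        "fractions": {
            "pattern": "common-denominator confusion",
            "rationale": "Student adds denominators as if independent.",
            "failure_rate": 0.7,
        },
    },
}

def route(schema, skill, bank=MEMORY_BANK):
    """Hierarchical routing: coarse by schema, then fine by skill tag."""
    return bank.get(schema, {}).get(skill)

def calibrate(llm_prob, paradigm, weight=0.5):
    """Blend the frozen LLM's raw correctness probability with the
    retrieved paradigm's historical success rate (a stand-in semantic
    constraint), clamped to [0, 1]."""
    if paradigm is None:
        return llm_prob  # nothing retrieved: fall back to the raw LLM score
    constrained = (1 - weight) * llm_prob + weight * (1 - paradigm["failure_rate"])
    return min(1.0, max(0.0, constrained))
```

Because the bank is a plain data structure rather than model weights, adding or revising a paradigm entry updates the system's behavior immediately, which mirrors the paper's claim of dynamic knowledge updates without gradient updates.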