Evaluating adaptive and generative AI-based feedback and recommendations in a knowledge-graph-integrated programming learning system

arXiv cs.AI / 3/27/2026

💬 Opinion | Ideas & Deep Analysis | Models & Research

Key Points

  • The paper presents a framework that combines an LLM with retrieval-augmented generation using both a knowledge graph and learners’ interaction history to provide formative code feedback and exercise recommendations.
  • The framework is embedded in an existing adaptive programming learning system and evaluated across three instructional modes: adaptive-only, GenAI-only, and a hybrid GenAI-adaptive approach.
  • Based on four log features derived from 4,956 code submissions, the results show that GenAI-based modes produce significantly more correct code and fewer submissions missing essential programming logic than adaptive-only feedback.
  • The hybrid GenAI-adaptive mode performs best overall, delivering the highest number of correct submissions and the fewest incorrect or incomplete attempts compared with either single-mode approach.
  • Survey results indicate learners generally find GenAI-generated feedback helpful, and all modes are rated positively for perceived ease of use and usefulness.

Abstract

This paper introduces the design and development of a framework that integrates a large language model (LLM) with a retrieval-augmented generation (RAG) approach leveraging both a knowledge graph and user interaction history. The framework is incorporated into a previously developed adaptive learning support system to assess learners' code, generate formative feedback, and recommend exercises. Moreover, this study examines learner preferences across three instructional modes: adaptive, Generative AI (GenAI), and hybrid GenAI-adaptive. An experimental study was conducted to compare learners' performance and perceptions, as well as the effectiveness of the three modes, using four key log features derived from 4,956 code submissions across all experimental groups. The analysis shows that learners receiving feedback from the GenAI modes had significantly more correct code and fewer code submissions missing essential programming logic than those receiving feedback from the adaptive mode. In particular, the hybrid GenAI-adaptive mode achieved the highest number of correct submissions and the fewest incorrect or incomplete attempts, outperforming both the adaptive-only and GenAI-only modes. Questionnaire responses further indicated that GenAI-generated feedback was widely perceived as helpful, while all modes were rated positively for ease of use and usefulness. These results suggest that the hybrid GenAI-adaptive mode outperforms the other two modes across all measured log features.
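
The abstract does not spell out how the retrieval step works, so the following is only a minimal sketch of the general idea: gather context from a knowledge graph and a learner's interaction history, then assemble it into a prompt for an LLM. Every name here (`KNOWLEDGE_GRAPH`, `Submission`, `retrieve_context`, `build_prompt`), the graph contents, and the prompt wording are hypothetical illustrations, not the paper's implementation. No LLM is called.

```python
from dataclasses import dataclass

# Hypothetical toy knowledge graph: concept -> prerequisite concepts.
# The paper's actual graph schema is not given in the abstract.
KNOWLEDGE_GRAPH = {
    "variables": [],
    "for_loop": ["variables"],
    "nested_loop": ["for_loop"],
}


@dataclass
class Submission:
    """One entry of a learner's interaction history (assumed shape)."""
    exercise_id: str
    concept: str
    correct: bool


def retrieve_context(concept, history, graph=KNOWLEDGE_GRAPH):
    """Collect prerequisites from the graph and related past attempts."""
    prerequisites = graph.get(concept, [])
    related = {concept, *prerequisites}
    past = [s for s in history if s.concept in related]
    # Concepts with at least one incorrect related attempt.
    weak = sorted({s.concept for s in past if not s.correct})
    return {
        "concept": concept,
        "prerequisites": prerequisites,
        "weak_concepts": weak,
    }


def build_prompt(code, context):
    """Combine the retrieved context and the learner's code into one prompt."""
    return (
        f"Target concept: {context['concept']}\n"
        f"Prerequisites: {', '.join(context['prerequisites']) or 'none'}\n"
        f"Concepts the learner struggled with: "
        f"{', '.join(context['weak_concepts']) or 'none'}\n"
        "Give formative feedback on this code and suggest one exercise:\n"
        f"{code}"
    )
```

In a sketch like this, the resulting string would be passed to whichever LLM the system uses. A hybrid mode could additionally combine the model's suggestion with the adaptive system's own recommendation, though how the paper merges the two is not stated in the abstract.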