Should There be a Teacher In-the-Loop? A Study of Generative AI Personalized Tasks Middle School

arXiv cs.AI / 4/15/2026

Key Points

  • The study investigates whether using generative AI (ChatGPT) with a teacher-in-the-loop can efficiently produce personalized middle-school math tasks aligned to students’ interests.
  • Seven teachers collaborated with ChatGPT to generate personalized curriculum problems, and the researchers analyzed teachers’ prompting strategies, creation efficiency, and students’ responses.
  • Teacher-in-the-loop workflows enabled generative AI-enhanced personalization at a relatively broad “grain size,” but students generally preferred finer-grained personalization that included specific popular-culture references.
  • Teachers devoted substantial effort to adjusting pop-culture references and fixing issues such as insufficient depth or realism in AI-generated problems, and they varied in how much ownership of the problems they gave to the generative AI.
  • While teachers improved over time in crafting engaging problems with AI, the process did not become clearly more time-efficient despite iterative refinement based on student data.

Abstract

Adapting instruction to the fine-grained needs of individual students is a powerful application of recent advances in large language models. These generative AI models can create tasks that correspond to students' interests and enact context personalization, enhancing students' interest in learning academic content. However, when there is a teacher in-the-loop creating or modifying tasks with generative AI, it is unclear how efficient this process might be, despite commercial generative AI tools' claims that they will save teachers time. In the present study, we teamed 7 middle school mathematics teachers with ChatGPT to create personalized versions of problems in their curriculum, to correspond to their students' interests. We look at the prompting moves teachers made, their efficiency when creating problems, and the reactions of their 521 7th grade students who received the personalized assignments. We find that having a teacher-in-the-loop results in generative AI-enhanced personalization being enacted at a relatively broad grain size, whereas students tend to prefer a smaller grain size where they receive specific popular culture references that interest them. Teachers spent a lot of effort adjusting popular culture references and addressing issues with the depth or realism of the problems generated, giving higher or lower levels of ownership to the generative AI. Teachers were able to improve in their ability to craft interesting problems in partnership with generative AI, but this process did not appear to become particularly time efficient as teachers learned and reflected on their students' data, iterating their approaches.