A Two-Stage, Object-Centric Deep Learning Framework for Robust Exam Cheating Detection

arXiv cs.CV / 4/20/2026

Key Points

  • The paper proposes a two-stage, object-centric deep learning framework for exam cheating detection that combines student localization via YOLOv8n with per-region behavior classification using a fine-tuned RexNet-150 model.
  • Trained on a dataset compiled from 10 independent sources totaling 273,897 samples, the system reports 0.95 accuracy, 0.94 recall, 0.96 precision, and a 0.95 F1-score, a gain of 13 percentage points over the 0.82 accuracy of a video-based baseline.
  • The approach is designed for real-world deployment, claiming robustness and scalability with an average inference time of 13.9 ms per sample.
  • The authors emphasize ethical handling by delivering final outcomes privately to individual students (e.g., via personal email) to avoid public exposure or shaming.
  • They suggest future enhancements such as incorporating audio signals and consecutive video frames to further improve detection accuracy and reliability.
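
The reported figures can be cross-checked with a little arithmetic. The sketch below uses only the numbers quoted in this summary. The variable and function names are illustrative, and the throughput line assumes a single sequential stream with no batching, which the source does not specify.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Figures as quoted in the summary (not re-measured here).
accuracy, recall, precision = 0.95, 0.94, 0.96
baseline_accuracy = 0.82
latency_ms = 13.9

f1 = f1_score(precision, recall)
absolute_gain = accuracy - baseline_accuracy       # difference in accuracy
relative_gain = absolute_gain / baseline_accuracy  # fraction of the baseline
throughput = 1000.0 / latency_ms                   # samples/s, assuming one sequential stream

print(f"F1 from precision/recall: {f1:.3f}")
print(f"Absolute gain: {absolute_gain * 100:.1f} percentage points")
print(f"Relative gain: {relative_gain * 100:.1f}%")
print(f"Throughput: {throughput:.1f} samples/s")
```

The F1 value recomputed from precision and recall rounds to the reported 0.95. The gap to the baseline is 13 points in absolute terms, or roughly 16% relative to 0.82. At 13.9 ms per sample, one sequential stream handles about 72 samples per second.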

Abstract

Academic integrity continues to face the persistent challenge of examination cheating. Traditional invigilation relies on human observation, which is inefficient, costly, and error-prone at scale. Although some existing AI-powered monitoring systems have been deployed and trusted, many lack transparency or require multi-layered architectures to achieve the desired performance. To overcome these challenges, we propose an improvement over a simple two-stage framework for exam cheating detection that integrates object detection and behavioral analysis using well-known technologies. First, the state-of-the-art YOLOv8n model is used to localize students in exam-room images. Each detected region is then cropped, preprocessed, and classified by a fine-tuned RexNet-150 model as either normal or cheating behavior. The system is trained on a dataset compiled from 10 independent sources with a total of 273,897 samples, achieving 0.95 accuracy, 0.94 recall, 0.96 precision, and a 0.95 F1-score, a 13-percentage-point increase over a baseline accuracy of 0.82 in video-based cheating detection. In addition, with an average inference time of 13.9 ms per sample, the proposed approach demonstrates robustness and scalability for deployment in large-scale environments. Beyond the technical contribution, the AI-assisted monitoring system also addresses ethical concerns by ensuring that final outcomes are delivered privately to individual students after the examination, for example, via personal email. This prevents public exposure or shaming and offers students an opportunity to reflect on their behavior. Future work could incorporate additional signals, such as audio data and consecutive frames, to achieve greater accuracy. This study provides a foundation for developing real-time, scalable, ethical, and open-source solutions.
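
The two-stage flow described in the abstract (localize students, crop each region, classify the crop) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. `Box`, `crop`, `run_pipeline`, and both stub models are hypothetical names introduced here. The stubs stand in for YOLOv8n and the fine-tuned RexNet-150, and preprocessing (resizing, normalization) is omitted.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Image = List[List[int]]  # toy grayscale image: rows of pixel values


@dataclass(frozen=True)
class Box:
    """Axis-aligned region in pixel coordinates (x2 and y2 are exclusive)."""
    x1: int
    y1: int
    x2: int
    y2: int


def crop(image: Image, box: Box) -> Image:
    """Cut the detected region out of the full frame."""
    return [row[box.x1:box.x2] for row in image[box.y1:box.y2]]


def run_pipeline(
    image: Image,
    detect: Callable[[Image], Sequence[Box]],
    classify: Callable[[Image], str],
) -> List[Tuple[Box, str]]:
    """Stage 1 localizes students; stage 2 labels each cropped region."""
    return [(box, classify(crop(image, box))) for box in detect(image)]


# Hypothetical stand-ins for the real models, for illustration only.
def stub_detector(image: Image) -> List[Box]:
    # Pretend two students were found: the left and right halves of the frame.
    height, width = len(image), len(image[0])
    return [Box(0, 0, width // 2, height), Box(width // 2, 0, width, height)]


def stub_classifier(region: Image) -> str:
    # Toy rule: bright regions are labeled "cheating", dark ones "normal".
    pixels = [p for row in region for p in row]
    return "cheating" if sum(pixels) / len(pixels) > 127 else "normal"


frame = [[0, 0, 255, 255] for _ in range(4)]
results = run_pipeline(frame, stub_detector, stub_classifier)
for box, label in results:
    print(box, label)
```

Passing the detector and classifier as callables keeps the two stages independent, so either model could be swapped without touching the pipeline logic. That separation is the essence of the object-centric design: the classifier only ever sees one student's region at a time.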