Abstract
As online education continues to grow, maintaining academic integrity in remote assessments has become a significant challenge. To address this issue, we propose a Real-Time Video Proctoring System powered by Computer Vision. This system leverages object detection, gaze tracking, and emotion analysis to detect cheating behaviors in real time and thereby safeguard the integrity of online exams.
In this publication, we demonstrate how our solution integrates multi-modal AI techniques, including face detection, emotion recognition, and mobile phone detection, to monitor students during exams. The system is designed to be scalable, efficient, and accurate, offering a robust approach to exam integrity in virtual environments.
Keywords
Computer Vision, Video Proctoring, Gaze Detection, Mobile Detection, Multi-Person Detection, AI-based Proctoring, Real-time Detection
Introduction
Cheating in online assessments poses a significant threat to educational integrity. Current solutions often rely on static monitoring tools, which are prone to human error and can be circumvented. Our system aims to provide a dynamic, real-time monitoring solution that uses AI-driven techniques to detect suspicious behavior, such as:
Looking away from the screen for extended periods
Multiple persons appearing in the frame
Mobile phone usage during the exam
Methodology
The proctoring system leverages several AI models and techniques for real-time detection:
Face Detection: We use OpenCV's Haar Cascade classifier to detect faces in the video feed.
Gaze Tracking: Using the relative movement of the student's face, we determine whether the student is looking away from the screen.
Emotion Recognition: The FER (Facial Expression Recognition) library is used to analyze emotions from the detected faces, identifying signs of stress or dishonesty.
Mobile Phone Detection: A pre-trained object detection model (SSD with MobileNet) is employed to detect the presence of mobile phones in the student's environment.
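A minimal per-frame orchestration of these checks can be sketched as below; the check functions and flag names are hypothetical stand-ins for the real detectors described in the following sections.

```python
# Hypothetical per-frame pipeline: each check stands in for a real detector
# (Haar cascade, gaze tracker, FER, SSD MobileNet). Frame summaries are
# plain dicts here for illustration.
from typing import Callable, Dict, List

def analyze_frame(frame: dict, checks: Dict[str, Callable]) -> List[str]:
    """Run every configured check on a frame summary; return triggered flags."""
    return [name for name, check in checks.items() if check(frame)]

# Trivial stand-in checks; a real system would pass model-backed callables.
checks = {
    "no_face":      lambda f: f.get("faces", 0) == 0,
    "multi_person": lambda f: f.get("faces", 0) > 1,
    "gaze_away":    lambda f: f.get("gaze") in ("left", "right"),
    "phone":        lambda f: f.get("phone_score", 0.0) > 0.5,
}

frame = {"faces": 2, "gaze": "left", "phone_score": 0.1}
print(analyze_frame(frame, checks))  # flags multi-person and gaze shift
```

Keeping the checks as independent callables makes it straightforward to add or disable detectors without touching the main loop.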
Pre-trained Models
We use a pre-trained SSD MobileNet V2 model for object detection, capable of identifying various objects, including mobile phones.
The model was loaded using TensorFlow’s saved_model API.
Face and Gaze Detection
The system utilizes OpenCV’s Haar Cascade Classifier for face detection. Gaze direction is tracked by computing the face’s center position across consecutive frames, and the gaze is classified as left, right, or forward.
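The frame-to-frame comparison can be expressed as a small helper; the face centre would come from the bounding box returned by OpenCV's `detectMultiScale`, and the pixel dead-zone threshold below is an illustrative assumption.

```python
def classify_gaze(prev_cx: float, cur_cx: float, threshold: float = 25.0) -> str:
    """Classify gaze from horizontal movement of the face centre (pixels).

    prev_cx / cur_cx would be x + w / 2 of the Haar-cascade face boxes in
    consecutive frames; the 25-pixel dead-zone is an assumed default.
    """
    delta = cur_cx - prev_cx
    if delta > threshold:
        return "right"
    if delta < -threshold:
        return "left"
    return "forward"

print(classify_gaze(300.0, 340.0))  # large shift: flagged as "right"
print(classify_gaze(300.0, 305.0))  # within the dead-zone: "forward"
```

In practice the threshold would be tuned to the camera resolution, and a few consecutive "away" frames would be required before raising a flag, to avoid penalising momentary head movement.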
Emotion Detection
The FER (Facial Expression Recognition) library is integrated to capture the candidate's emotional state during the exam, adding a further layer of analysis for identifying potential cheating.
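FER reports, per detected face, a dictionary of scores over seven emotions. A simple post-processing rule can turn those scores into a stress flag; the choice of emotions to group and the 0.6 threshold below are our own assumptions, not part of FER.

```python
def is_stressed(emotions: dict, threshold: float = 0.6) -> bool:
    """Flag a face as stressed when fear/anger/sadness scores dominate.

    `emotions` mimics the per-face score dict produced by FER's
    detect_emotions(); the grouped 0.6 threshold is an assumption.
    """
    stress_score = sum(emotions.get(e, 0.0) for e in ("fear", "angry", "sad"))
    return stress_score >= threshold

# Example score dict in the shape FER produces for one face:
scores = {"angry": 0.05, "disgust": 0.0, "fear": 0.55, "happy": 0.1,
          "sad": 0.2, "surprise": 0.0, "neutral": 0.1}
print(is_stressed(scores))  # 0.05 + 0.55 + 0.2 = 0.8 >= 0.6
```

A stress flag alone is weak evidence of dishonesty, so it would normally only be recorded alongside other signals rather than triggering an alert on its own.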
Mobile Detection
The MobileNet model identifies bounding boxes around objects classified as mobile phones. If the object detected is classified as a cell phone (COCO class ID 77), the system flags the detection.
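Filtering the detector output for cell phones reduces to checking class IDs and confidence scores. The parallel-array layout below mirrors the TF Object Detection API's `detection_classes` / `detection_scores` / `detection_boxes` outputs; the 0.5 score threshold is an assumption.

```python
CELL_PHONE_CLASS_ID = 77  # "cell phone" in the COCO label map

def phone_detections(classes, scores, boxes, min_score: float = 0.5):
    """Return bounding boxes classified as 'cell phone' above min_score.

    classes/scores/boxes follow the TF Object Detection API convention of
    parallel per-detection arrays (boxes as [ymin, xmin, ymax, xmax]).
    """
    return [box for cls, score, box in zip(classes, scores, boxes)
            if cls == CELL_PHONE_CLASS_ID and score >= min_score]

# Mocked detector output: one confident phone, a person, a low-confidence hit.
classes = [77, 1, 77]
scores = [0.91, 0.88, 0.30]
boxes = [[0.1, 0.1, 0.3, 0.3], [0.0, 0.0, 1.0, 1.0], [0.5, 0.5, 0.7, 0.7]]
print(phone_detections(classes, scores, boxes))
```

The same filter generalises to other contraband classes by swapping the class ID.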
Real-time Video Processing
The system uses OpenCV to access the webcam, continuously processing frames and updating detection results in real time. If suspicious behavior is detected (e.g., frequent gaze shifts, mobile usage), the system records these instances for further analysis.
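Recording suspicious instances can be as simple as appending timestamped events for post-exam review. The logger below is a hypothetical sketch (the flag names are our own); in the real system it would be fed from inside the `cv2.VideoCapture` read loop.

```python
import time

class EventLog:
    """Accumulate timestamped suspicion events for post-exam review."""

    def __init__(self):
        self.events = []  # list of (timestamp, flag) tuples

    def record(self, flag, timestamp=None):
        """Record one event; defaults to the current wall-clock time."""
        self.events.append((time.time() if timestamp is None else timestamp, flag))

    def count(self, flag):
        """How many times a given flag was raised during the session."""
        return sum(1 for _, f in self.events if f == flag)

log = EventLog()
log.record("gaze_away", timestamp=12.4)
log.record("phone", timestamp=13.0)
log.record("gaze_away", timestamp=15.2)
print(log.count("gaze_away"))
```

Storing raw timestamped events, rather than a single verdict, lets a human reviewer inspect the timeline and judge each incident in context.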
References
OpenCV. (n.d.). Haar Cascade Classifier. Retrieved from https://opencv.org.
FER Library. (n.d.). Facial Expression Recognition. Retrieved from https://github.com/priya-dwivedi/fer.
TensorFlow. (n.d.). SSD MobileNet. Retrieved from https://www.tensorflow.org.