# Abstract

Gun violence in U.S. schools continues to be a critical safety concern, exposing the limitations of traditional surveillance systems that rely on human monitoring. This research presents SafeSchool, an AI-powered security system that combines YOLO-based object detection with a Langchain-based monitoring agent, replacing human surveillance with continuous automated threat assessment. The system processes security camera feeds in real time, detecting weapons with high confidence and triggering an AI agent for comprehensive scene analysis. This approach achieves initial weapon detection in under 50ms and delivers detailed threat analysis within 6 seconds, including weapon identification, suspect description, and location-specific security protocols. The system was developed and validated using a custom dataset of 5,149 annotated frames, collected across three strategic camera positions in a controlled university environment. Results demonstrate substantial improvements in threat detection speed and response time compared to traditional surveillance methods, while maintaining a 99% detection rate and minimizing false positives.

# Introduction

## The Challenge of School Security

Gun violence in U.S. schools represents an ongoing critical safety concern that demands innovative technological solutions. Current security infrastructure, while extensive, is limited by its dependence on human monitoring of surveillance systems. Security personnel tasked with monitoring multiple video feeds face inevitable attention fatigue, leading to potential oversights in threat detection and delayed response times during critical situations.

## Limitations of Current Systems

Traditional school surveillance systems face several fundamental challenges:

- Dependency on human attention span for continuous monitoring
- Delayed threat verification due to manual assessment processes
- Inconsistent threat evaluation based on individual interpretation
- Extended response times due to multi-step alert protocols
- Limited scalability constrained by available security personnel

## The SafeSchool Solution

This research introduces SafeSchool, implementing an AI-driven dual-detection approach. The system's first layer employs YOLO (You Only Look Once) for continuous weapon detection across surveillance feeds. Upon detection, a Langchain-based AI agent activates to perform comprehensive scene analysis, effectively replacing human monitoring with consistent, automated threat assessment.

This automated approach provides several key advantages:

- Continuous, fatigue-free monitoring of security feeds
- Standardized threat assessment protocols
- Rapid multi-frame analysis for threat confirmation
- Location-aware security response recommendations
- Automated alert generation and distribution

The integration of an AI monitoring agent significantly reduces the critical time between initial detection and response initiation while maintaining high accuracy standards. Most importantly, the system demonstrates effectiveness in early threat detection and assessment, potentially preventing escalation of security incidents in school environments.

https://github.com/narasimhakarthik2/SafeSchool

# Related Work

Real-time weapon detection in surveillance systems remains a significant challenge in security applications. Salazar González et al. (2020) conducted foundational research on gun detection in CCTV systems, establishing the complexity of achieving reliable real-time detection under varying conditions. Their work highlighted key limitations in processing speed and accuracy that continue to influence current research.
Recent developments in deep learning, particularly YOLO architectures, have improved detection capabilities but still face challenges in threat assessment and response time. Traditional security systems rely heavily on human monitoring for threat verification, leading to potential delays and oversights due to operator fatigue.

The integration of Large Language Models (LLMs) with vision systems represents a novel approach to automated surveillance. While existing systems focus primarily on object detection, this research extends the state-of-the-art by combining YOLO-based weapon detection with LLM-powered scene analysis, addressing key limitations identified in previous studies.

# Methodology

This section details the system architecture and implementation of SafeSchool. The system consists of three main components: weapon detection using YOLO, scene analysis using a Langchain-based AI agent, and a location-aware alert system.

![arch.jpg](arch.jpg)

## System Architecture

### Weapon Detection Module

- Implementation of YOLOv8 for real-time weapon detection
- Confidence threshold set at 75% to minimize false positives
- Detection categories: Handgun, Short rifle, Knife
- Processing speed: <50ms per frame

### AI Monitoring Agent

- Langchain-based implementation using GPT-4-Vision
- Three-stage analysis protocol:

---

![llm Agent.jpg](llm%20Agent.jpg)

1. Initial threat assessment upon weapon detection
2. Confirmatory analysis at 2-second intervals
3. Final threat evaluation with location-specific recommendations

---

- Scene analysis completion within 6 seconds

### Location-Aware Alert System

- Integration with camera mapping system
- Automated security response generation based on threat location

# Experiments

### Dataset

This study utilizes the publicly available mock attack dataset by Salazar González et al. (2020), which simulates realistic threat scenarios in a university environment.
The dataset comprises 5,149 annotated frames from three surveillance cameras covering distinct environmental conditions:

- Camera 1 (Corridor): 607 frames of uniform lighting with common obstacles such as doors and bins, recorded at 2 FPS over 5 minutes.
- Camera 7 (Corridor): 3,511 frames with additional environmental challenges including wall-mounted objects and fire extinguishers, captured at 2 FPS across 29 minutes.
- Camera 5 (Entrance): 1,031 frames featuring irregular lighting conditions and challenging surface materials like black carpeting, recorded at 2 FPS over 8 minutes.

### Training Configuration

The model was trained using the YOLOv8n architecture. The network was configured to detect three classes: Handgun, Knife, and Short_rifle. Training parameters included a batch size of 8, an input resolution of 640x640 pixels, and the Adam optimizer with an initial learning rate of 0.001 and weight decay of 0.0005. Real-time data augmentation was employed during training.

### Training Results

Performance metrics tracked through Weights & Biases demonstrate strong detection capabilities across all weapon classes. The F1-confidence curves show optimal performance at a confidence threshold of 0.75, with handguns achieving the highest F1 score of 0.8.

The precision-recall curves indicate robust model performance, particularly for handgun detection, maintaining precision above 0.9 across a wide range of recall values.

The recall-confidence analysis reveals effective detection sensitivity, with handguns and short rifles maintaining recall rates above 0.7 at the operational confidence threshold. Knife detection showed comparatively lower recall rates, likely because knives have a smaller visual signature in surveillance footage.
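
The training configuration described in this section could be expressed with the Ultralytics Python API roughly as follows. This is a minimal sketch, not the project's actual training script: the hyperparameter values come from the text, while the dataset YAML path, the number of epochs, the class index order, and the choice to start from the pretrained `yolov8n.pt` checkpoint are assumptions the text does not specify. Augmentation is left at the library defaults, which is also an assumption.

```python
"""Sketch of the YOLOv8n training setup, assuming the Ultralytics API."""

# Hyperparameters stated in the text, keyed by Ultralytics `train` argument names.
TRAIN_CONFIG = {
    "batch": 8,              # batch size
    "imgsz": 640,            # 640x640 input resolution
    "optimizer": "Adam",     # optimizer named in the text
    "lr0": 0.001,            # initial learning rate
    "weight_decay": 0.0005,  # weight decay
}

# Class names as listed in the text; the index order is an assumption.
CLASS_NAMES = ["Handgun", "Knife", "Short_rifle"]


def train(data_yaml: str, epochs: int):
    """Train YOLOv8n with the configuration above.

    `data_yaml` (dataset definition file) and `epochs` are not given in the
    text, so the caller must supply them.
    """
    # Imported lazily so the configuration can be inspected without the
    # dependency installed (requires `pip install ultralytics`).
    from ultralytics import YOLO

    # Assumption: training starts from the pretrained nano checkpoint.
    model = YOLO("yolov8n.pt")
    return model.train(data=data_yaml, epochs=epochs, **TRAIN_CONFIG)
```

A call such as `train("weapons.yaml", epochs=100)` would start a run, where both the file name and the epoch count are hypothetical placeholders. Keeping the hyperparameters in a plain dictionary makes it easy to log the same values to an experiment tracker alongside the run.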

### Performance Analysis

At the operational confidence threshold of 0.75:

- Handgun detection achieved precision of 0.95 and recall of 0.85
- Short rifle detection maintained precision of 0.92 and recall of 0.82
- Knife detection showed precision of 0.88 with recall of 0.65

These results demonstrate the model's effectiveness for real-world weapon detection applications, particularly for firearm identification in surveillance scenarios.

| | |
| --- | --- |
| ![W&B Chart 12_26_2024, 2_11_40 PM.png](W%26B%20Chart%2012_26_2024%2C%202_11_40%20PM.png) | ![W&B Chart 12_26_2024, 2_11_57 PM.png](W%26B%20Chart%2012_26_2024%2C%202_11_57%20PM.png) |
| ![W&B Chart 12_26_2024, 2_12_38 PM.png](W%26B%20Chart%2012_26_2024%2C%202_12_38%20PM.png) | ![W&B Chart 12_26_2024, 2_12_14 PM.png](W%26B%20Chart%2012_26_2024%2C%202_12_14%20PM.png) |

# Results

## Weapon Detection Performance

The YOLOv8n model demonstrated robust detection capabilities at the 0.75 confidence threshold:

- Handgun: 0.95 precision, 0.85 recall
- Short_rifle: 0.92 precision, 0.82 recall
- Knife: 0.88 precision, 0.65 recall

Average inference time: <50ms per frame

![out1.png](out1.png)

## LLM-based Threat Monitoring

The Langchain-powered AI agent provides continuous threat assessment through a three-stage analysis:

1. Initial Assessment
   - Triggered immediately upon weapon detection
   - Structured analysis of weapon type, suspect description, and risk level
   - Response generated in <2 seconds

   ![out4.png](out4.png)

2. Threat Confirmation
   - Multiple frame analysis at 2-second intervals
   - ConversationBufferMemory maintains analysis history for context
   - Continuous threat level assessment based on scene changes

   ![out2.png](out2.png)

3. Final Assessment
   - Generated after analyzing multiple frames
   - Risk level determined from historical analyses stored in memory
   - Location-aware security recommendations based on camera mapping

   ![out3.png](out3.png)

#### The integrated system demonstrated significant advantages:

- Continuous, automated monitoring without fatigue
- Structured threat assessment with context retention
- Average response time of 6.2 seconds from detection to final assessment
- Location-specific security protocols based on camera positioning

:::youtube[Title]{#IT_fHpqydNE}

# Conclusion

This research presents SafeSchool, an automated security system that successfully integrates YOLO-based weapon detection with LLM-powered threat analysis. The system addresses the critical limitations of traditional surveillance by replacing human monitoring with an AI agent capable of continuous, fatigue-free threat assessment.

The YOLOv8n model achieved high detection accuracy across weapon classes, with precision rates exceeding 0.90 for firearms at a 0.75 confidence threshold. The integration of a Langchain-based AI agent enables comprehensive threat analysis through a three-stage assessment process, storing contextual information in memory for improved decision-making.

Key system achievements include:

- Sub-50ms weapon detection speed
- 6.2-second response time from detection to final assessment
- Location-aware threat analysis and response protocols
- Continuous monitoring capability without human fatigue

# References

1. Salazar González, J. L., Zaccaro, C., Álvarez-García, J. A., Soria-Morillo, L. M., & Sancho Caparrini, F. (2020). Real-time gun detection in CCTV: An open problem. Neural Networks, 132, 297-308. https://doi.org/10.1016/j.neunet.2020.09.013
2. Lim, J., et al. (2021). Deep multi-level feature pyramids: Application for non-canonical firearm detection in video surveillance. Engineering Applications of Artificial Intelligence, 97, 104094.
3. Jocher, G., et al. (2023). ultralytics/ultralytics: v8.0.0 First Release (v8.0.0). Zenodo. https://doi.org/10.5281/zenodo.7747343
4. Chase, H., et al. (2023). LangChain: Building applications with LLMs through composability. GitHub repository. https://github.com/hwchase17/langchain
5. OpenAI (2023). GPT-4 Technical Report. arXiv preprint arXiv:2303.08774.