This publication presents a real-time object detection application built on the YOLOv5 model. The application processes video streams from files or webcams and gives users a graphical interface for setting detection parameters and visualizing results. The project demonstrates how deep learning can be applied to a practical computer vision task.
Object detection is a core task in computer vision, enabling machines to identify and locate objects within images or video streams. YOLO (You Only Look Once) is a family of single-stage detectors known for combining speed and accuracy in real-time object detection. This project aims to create an accessible application that uses YOLOv5 to perform object detection in real time, allowing users to interact with the model through a simple graphical user interface (GUI).
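The application is built using Python and integrates several libraries: OpenCV (cv2) for video capture and frame handling, PyTorch (torch) for loading and running the YOLOv5 model, and Tkinter for the graphical interface.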
Import Libraries and Initialize Model:
import cv2
import torch
import time
import tkinter as tk
from tkinter import filedialog, messagebox, simpledialog
model = torch.hub.load('ultralytics/yolov5', 'yolov5s') # Load YOLOv5 small model
model.eval()
# Ensure the model runs on the GPU if one is available
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model.to(device)
Preprocessing and Frame Handling:
def preprocess_frame(frame):
    """Convert frame from BGR to RGB as YOLO requires RGB images."""
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    return rgb_frame
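To see why the conversion matters, the following minimal sketch reverses the channel order of a tiny image represented as nested Python lists. It is an illustration only: the helper name `bgr_to_rgb` is hypothetical, and the application itself relies on `cv2.cvtColor` as shown above.

```python
def bgr_to_rgb(image):
    """Return a copy of `image` with each pixel reordered from (B, G, R) to (R, G, B).

    `image` is a list of rows, each row a list of 3-tuples. Hypothetical helper
    for illustration; real frames are NumPy arrays handled by cv2.cvtColor.
    """
    return [[(r, g, b) for (b, g, r) in row] for row in image]


# A 1x2 image in OpenCV's BGR order: one pure-blue pixel and one pure-red pixel.
bgr_image = [[(255, 0, 0), (0, 0, 255)]]
rgb_image = bgr_to_rgb(bgr_image)
print(rgb_image)  # [[(0, 0, 255), (255, 0, 0)]]
```

Without this step, a blue object in the source frame would be presented to the model as red, which can degrade detection quality.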
Start Video Function:
def start_video(video_path=None, is_camera=False, resolution=(640, 480)):
    """Start processing video or camera feed"""
    # Open the given file if a path was supplied; otherwise use the default camera (index 0)
    cap = cv2.VideoCapture(video_path) if video_path else cv2.VideoCapture(0)
    if not cap.isOpened():
        messagebox.showerror("Error", "Could not open video source.")
        return
    # Additional code for processing frames and saving output...
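The frame-processing step elided above usually follows a read, detect, display loop. The sketch below shows one possible shape for that loop and is not the author's implementation: `process_stream`, `detect`, `on_frame`, and `FakeCapture` are hypothetical names, and the capture object only needs the `read()` and `release()` methods that `cv2.VideoCapture` provides.

```python
def process_stream(cap, detect, on_frame, max_frames=None):
    """Read frames from `cap` until it is exhausted, run `detect` on each,
    and pass the frame and its detections to `on_frame`.

    Returns the number of frames processed.
    """
    count = 0
    try:
        while max_frames is None or count < max_frames:
            ok, frame = cap.read()
            if not ok:  # end of file or camera failure
                break
            detections = detect(frame)
            on_frame(frame, detections)
            count += 1
    finally:
        cap.release()  # always free the video source
    return count


class FakeCapture:
    """Stand-in for cv2.VideoCapture that yields a fixed list of frames."""

    def __init__(self, frames):
        self._frames = list(frames)
        self.released = False

    def read(self):
        if self._frames:
            return True, self._frames.pop(0)
        return False, None

    def release(self):
        self.released = True


seen = []
cap = FakeCapture(["frame0", "frame1", "frame2"])
n = process_stream(cap, detect=lambda f: [], on_frame=lambda f, d: seen.append(f))
print(n, seen, cap.released)  # 3 ['frame0', 'frame1', 'frame2'] True
```

Passing the detector and the display callback as arguments keeps the loop independent of OpenCV and PyTorch, so it can be exercised without a camera or a GPU.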
GUI Setup:
root = tk.Tk()
root.title("YOLO Object Detection")
# Text entry for the confidence threshold, pre-filled with 0.5
confidence_var = tk.StringVar(value="0.5")
confidence_label = tk.Label(root, text="Confidence Threshold (0-1):")
confidence_label.pack()
confidence_entry = tk.Entry(root, textvariable=confidence_var)
confidence_entry.pack()
# load_video and use_camera are callbacks defined elsewhere in the application (not shown here)
load_video_button = tk.Button(root, text="Load Video", command=load_video)
load_video_button.pack(pady=10)
use_camera_button = tk.Button(root, text="Use Camera", command=use_camera)
use_camera_button.pack(pady=10)
root.mainloop()
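Because the confidence entry holds free-form text, the value needs validation before it is used as a threshold. The following is a minimal sketch of one way to do that; `parse_confidence` is a hypothetical helper, not part of the original code, and in the application it would presumably be called with `confidence_var.get()`.

```python
def parse_confidence(text, default=0.5):
    """Parse a confidence threshold from user-entered text.

    Returns a float in [0, 1]. Blank input falls back to `default`;
    non-numeric or out-of-range input raises ValueError.
    """
    text = text.strip()
    if not text:
        return default
    value = float(text)  # raises ValueError for non-numeric input
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"Confidence must be between 0 and 1, got {value}")
    return value


print(parse_confidence("0.35"))  # 0.35
print(parse_confidence("  "))    # 0.5
```

A caller can catch `ValueError` and report it with `messagebox.showerror`, matching the error handling already used for the video source.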
The application was tested using various video files and live camera feeds. The confidence threshold for object detection was adjustable via the GUI, allowing users to filter detections based on their requirements. The performance was monitored by calculating the frames per second (FPS) during the detection process.
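The FPS measurement mentioned above can be sketched as a small counter that is ticked once per processed frame. This is an assumed design rather than the original code: the class name `FPSMeter` and the injectable `now` argument are hypothetical, the latter added so the example can run on fixed timestamps.

```python
import time


class FPSMeter:
    """Track the average frames per second across recorded frames."""

    def __init__(self):
        self._start = None
        self._last = None
        self._frames = 0

    def tick(self, now=None):
        """Record one processed frame; `now` is injectable for testing."""
        now = time.perf_counter() if now is None else now
        if self._start is None:
            self._start = now
        self._last = now
        self._frames += 1

    @property
    def fps(self):
        """Average FPS over the intervals between recorded frames."""
        if self._frames < 2:
            return 0.0
        elapsed = self._last - self._start
        return (self._frames - 1) / elapsed if elapsed > 0 else 0.0


meter = FPSMeter()
for t in (0.0, 0.1, 0.2, 0.3, 0.4):  # five frames, 0.1 s apart
    meter.tick(now=t)
print(round(meter.fps, 1))  # 10.0
```

In the detection loop, `meter.tick()` would be called after each frame is processed, and `meter.fps` could be drawn on the frame or printed periodically.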
The application detected and labeled objects in real time, displaying bounding boxes and confidence scores. Users could also save the processed video with the detections overlaid, which makes the application useful for cases such as surveillance and traffic monitoring.
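The filtering and labeling described above can be sketched as follows. This is an illustration under stated assumptions, not the original implementation: the helper names are hypothetical, and each detection row is assumed to have the layout `(x1, y1, x2, y2, confidence, class_id)`, which is the layout YOLOv5 uses for rows of `results.xyxy[0]`.

```python
def filter_detections(rows, threshold):
    """Keep detections whose confidence is at least `threshold`.

    Each row is assumed to be (x1, y1, x2, y2, confidence, class_id).
    """
    return [row for row in rows if row[4] >= threshold]


def format_label(class_name, confidence):
    """Build overlay text such as 'person 0.87'."""
    return f"{class_name} {confidence:.2f}"


def draw_detections(frame, rows, names):
    """Draw a box and label per detection on `frame` (a BGR image array), in place."""
    import cv2  # imported lazily so the helpers above work without OpenCV installed

    for x1, y1, x2, y2, conf, cls in rows:
        top_left, bottom_right = (int(x1), int(y1)), (int(x2), int(y2))
        cv2.rectangle(frame, top_left, bottom_right, (0, 255, 0), 2)
        cv2.putText(frame, format_label(names[int(cls)], conf),
                    (int(x1), max(int(y1) - 5, 0)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
    return frame


rows = [(10, 20, 110, 220, 0.87, 0), (50, 60, 90, 120, 0.30, 2)]
kept = filter_detections(rows, threshold=0.5)
print(kept)  # [(10, 20, 110, 220, 0.87, 0)]
print(format_label("person", kept[0][4]))  # person 0.87
```

The annotated frame returned by `draw_detections` could then be shown with `cv2.imshow` or written to disk with a `cv2.VideoWriter` to produce the saved output video.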
Figure: Object Detection Example
Figure: Object Tracking Example
This project showcases the integration of YOLOv5 for real-time object detection in a user-friendly application. The combination of OpenCV, PyTorch, and Tkinter provides a robust framework for developing computer vision applications. Future work may involve optimizing the model for better performance and expanding the application’s capabilities to include additional features such as tracking and classification.
For further details, please refer to the [YOLOv5 GitHub Repository], [LinkedIn Post], and [Kaggle Notebook].