This paper introduces a user-friendly application for object detection and counting in video streams using the YOLO (You Only Look Once) architecture. Built on Streamlit, the application lets users upload videos, choose a counting region, and obtain object counts with minimal technical expertise. It addresses the growing need for accessible video analytics tools in domains such as retail traffic analysis, transportation monitoring, and security surveillance.
Object counting from video is a fundamental task in computer vision with applications spanning multiple industries. Traditional methods often require extensive programming knowledge or specialized hardware, creating barriers to adoption. Our application bridges this gap by providing an intuitive interface for object detection and counting powered by state-of-the-art YOLO models.
The integration of YOLO's real-time detection capabilities with Streamlit's interactive web framework creates a powerful yet accessible solution for video analytics. Users can process videos, visualize results, and extract valuable insights without writing a single line of code.
The application is built on three primary technical components:
YOLO Detection Engine: Utilizes the Ultralytics implementation of YOLO for accurate, real-time object detection with pre-trained models.
OpenCV Processing Pipeline: Handles video frame extraction, manipulation, and region-based analytics.
Streamlit Frontend: Provides the user interface components, including file uploading, parameter selection, and results visualization.
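The application follows a streamlined workflow: the user uploads a video (MP4, AVI, or MOV), selects a counting region type, and starts processing; the application then shows the processed video and offers it for download.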
The system supports three region types for counting: line, rectangle, and polygon.
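The line, rectangle, and polygon region types each reduce to a simple geometric test on a detection's position. The sketch below is illustrative only and is not taken from the project's source: the helper names (`bbox_center`, `crossed_line`, `in_rectangle`, `in_polygon`) are hypothetical, and it assumes each detection is represented by the center of its bounding box.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def bbox_center(bbox: Tuple[int, int, int, int]) -> Point:
    """Return the center of an (x1, y1, x2, y2) bounding box."""
    x1, y1, x2, y2 = bbox
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def side_of_line(p: Point, a: Point, b: Point) -> float:
    """Signed cross product: positive on one side of line a->b, negative on the other."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def crossed_line(prev: Point, curr: Point, a: Point, b: Point) -> bool:
    """True if a point changed sides between two frames.

    Assumption: the counting line is treated as infinite. A stricter version
    would also check that the crossing lies between the endpoints a and b.
    """
    return side_of_line(prev, a, b) * side_of_line(curr, a, b) < 0


def in_rectangle(p: Point, top_left: Point, bottom_right: Point) -> bool:
    """True if p lies inside or on the border of an axis-aligned rectangle."""
    return (top_left[0] <= p[0] <= bottom_right[0]
            and top_left[1] <= p[1] <= bottom_right[1])


def in_polygon(p: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count how many edges a rightward ray from p crosses."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through p can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside
```

A line region needs an object's position in two consecutive frames, which in practice means some form of tracking between frames; rectangle and polygon regions can be evaluated on a single frame.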
During video processing, the application provides:
The processed output includes:
The application requires Python and three packages: opencv-python-headless, streamlit, and ultralytics.
project/
├── app.py # Main Streamlit application
├── counters.py # Counter classes imported by app.py
├── requirements.txt # Dependencies
└── outputs/ # Auto-generated directory for results
The application can be deployed locally using:
pip install opencv-python-headless streamlit ultralytics
streamlit run app.py
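Before launching the app, it can help to confirm that the three packages installed correctly. The following is a small optional check, not part of the project itself; the function name `missing_modules` is hypothetical. Note that the opencv-python-headless package is imported under the name `cv2`.

```python
import importlib.util
from typing import Iterable, List

# Import names for the three packages in the install command above.
REQUIRED_MODULES = ["cv2", "streamlit", "ultralytics"]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```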
For cloud deployment, the application can be hosted on Streamlit Cloud, Heroku, or any platform supporting Python web applications.
YOLO models provide state-of-the-art detection performance with:
Processing speed depends on several factors:
On moderate hardware (quad-core CPU), the application achieves:
Current limitations include:
Planned enhancements include:
The YOLO-based object counting application demonstrates how modern deep learning techniques can be made accessible through intuitive user interfaces. By combining YOLO detection with Streamlit's interactive capabilities, we have created a tool that lets users in various domains extract insights from video data without specialized technical knowledge.
The open-source nature of the project encourages community contributions and adaptations for specific use cases, furthering the democratization of computer vision technologies.
import os

import cv2
from ultralytics import YOLO

from counters import LineCounter, RectangleCounter, PolygonCounter

# COCO class IDs to keep: person, car, motorcycle, bus, truck
TARGET_CLASSES = {0, 2, 3, 5, 7}


def process_video(video_file, region_type, progress_bar):
    # Initialize YOLO model
    model = YOLO("yolov8n.pt")

    # Open video file and read its properties
    cap = cv2.VideoCapture(video_file)
    total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    fps = cap.get(cv2.CAP_PROP_FPS)
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

    # Initialize output video writer
    output_path = f"outputs/processed_{os.path.basename(video_file)}"
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")
    out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))

    # Initialize counter based on region type
    if region_type == "Line":
        counter = LineCounter(start_point=(100, height // 2),
                              end_point=(width - 100, height // 2))
    elif region_type == "Rectangle":
        counter = RectangleCounter(top_left=(width // 4, height // 4),
                                   bottom_right=(width * 3 // 4, height * 3 // 4))
    else:  # Polygon
        # Define default polygon points
        points = [(width // 4, height // 4), (width * 3 // 4, height // 4),
                  (width * 3 // 4, height * 3 // 4), (width // 4, height * 3 // 4)]
        counter = PolygonCounter(points)

    # Process frames
    frame_count = 0
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break

        # Detect objects in frame
        results = model(frame, stream=True)

        # Collect detections for the classes of interest
        detections = []
        for r in results:
            for box in r.boxes:
                x1, y1, x2, y2 = box.xyxy[0]
                conf = box.conf[0]
                cls = int(box.cls[0])
                if cls in TARGET_CLASSES:
                    detections.append({
                        "bbox": (int(x1), int(y1), int(x2), int(y2)),
                        "conf": float(conf),
                        "cls": cls,
                    })

        # Update counter with new detections
        counter.update(detections)

        # Draw region and counters on frame
        counter.draw(frame)

        # Write frame to output video
        out.write(frame)

        # Update progress; the reported frame count can be zero or inexact,
        # so guard the division and clamp to the range st.progress accepts
        frame_count += 1
        if total_frames > 0:
            progress_bar.progress(min(frame_count / total_frames, 1.0))

    # Release resources
    cap.release()
    out.release()
    return output_path
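The detection dictionaries built in `process_video` (keys `bbox`, `conf`, `cls`) can also be summarized per class. The sketch below is an illustrative addition, not part of the project: `tally_by_class` and its `min_conf` parameter are hypothetical, and the name mapping covers only the five COCO class IDs that the listing filters on.

```python
from collections import Counter
from typing import Dict, List

# Names for the COCO class IDs kept by the filter in process_video.
COCO_NAMES = {0: "person", 2: "car", 3: "motorcycle", 5: "bus", 7: "truck"}


def tally_by_class(detections: List[dict], min_conf: float = 0.0) -> Dict[str, int]:
    """Count detections per class name, skipping those below min_conf."""
    counts: Counter = Counter()
    for det in detections:
        if det["conf"] >= min_conf:
            # Fall back to a generic label for IDs outside the mapping.
            name = COCO_NAMES.get(det["cls"], f"class_{det['cls']}")
            counts[name] += 1
    return dict(counts)
```

Called once per frame, this gives a per-class breakdown for that frame only; it does not identify the same object across frames.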
import streamlit as st
import cv2
import os
from ultralytics import YOLO
from counters import LineCounter, RectangleCounter, PolygonCounter


def main():
    st.title("Object Counting with YOLO")

    # Sidebar for file upload and options
    st.sidebar.header("Settings")
    uploaded_file = st.sidebar.file_uploader("Upload video", type=["mp4", "avi", "mov"])
    region_type = st.sidebar.selectbox(
        "Select counting region type",
        ["Line", "Rectangle", "Polygon"]
    )

    if uploaded_file is not None:
        # Save uploaded file
        with open("temp_video.mp4", "wb") as f:
            f.write(uploaded_file.getbuffer())

        # Process button
        if st.sidebar.button("Process Video"):
            # Create progress bar
            progress_bar = st.progress(0)

            # Process video
            output_path = process_video("temp_video.mp4", region_type, progress_bar)

            # Display results
            st.success("Processing complete!")

            # Video playback
            st.header("Processed Video")
            st.video(output_path)

            # Download button
            with open(output_path, "rb") as file:
                st.download_button(
                    label="Download processed video",
                    data=file,
                    file_name=os.path.basename(output_path),
                    mime="video/mp4"
                )
    else:
        st.info("Please upload a video file to begin.")


if __name__ == "__main__":
    # Create output directory if it doesn't exist
    os.makedirs("outputs", exist_ok=True)
    main()
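The listing imports its counter classes from a `counters` module whose contents are not shown in this paper. As a rough idea of what one of them could look like, here is a minimal sketch of a `RectangleCounter` that matches the constructor arguments and the `update`/`draw` calls used above. Everything inside the class is an assumption: it counts detections whose box center lies in the rectangle in the current frame, and the `current_count` and `peak_count` attributes are hypothetical.

```python
from typing import List, Tuple


class RectangleCounter:
    """Hypothetical sketch: per-frame count of detections centered in a rectangle."""

    def __init__(self, top_left: Tuple[int, int], bottom_right: Tuple[int, int]):
        self.top_left = top_left
        self.bottom_right = bottom_right
        self.current_count = 0  # detections inside the region in the latest frame
        self.peak_count = 0     # largest per-frame count seen so far

    def _contains(self, bbox: Tuple[int, int, int, int]) -> bool:
        """True if the center of an (x1, y1, x2, y2) box lies in the rectangle."""
        x1, y1, x2, y2 = bbox
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        return (self.top_left[0] <= cx <= self.bottom_right[0]
                and self.top_left[1] <= cy <= self.bottom_right[1])

    def update(self, detections: List[dict]) -> None:
        """Recompute the count from this frame's detections."""
        self.current_count = sum(1 for d in detections if self._contains(d["bbox"]))
        self.peak_count = max(self.peak_count, self.current_count)

    def draw(self, frame) -> None:
        """Draw the region and the current count onto the frame in place."""
        import cv2  # imported lazily so the counting logic has no OpenCV dependency

        cv2.rectangle(frame, self.top_left, self.bottom_right, (0, 255, 0), 2)
        cv2.putText(frame, f"In region: {self.current_count}", (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
```

This sketch reports how many objects are in the region at a given moment. Counting each object only once over the whole video would additionally require tracking objects between frames.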
Note: This publication describes an open-source project available at Object Counting with YOLO. The implementation is provided under the MIT License and welcomes community contributions.
