This project presents a real-time object detection and collision alert system built with Python, Streamlit, OpenCV, and YOLOv8. It is designed to detect objects from a live video feed, measure proximity between specified pairs, and trigger custom alerts when a defined threshold is breached. Initially conceptualized to deter a curious cat from damaging a new TV, the system demonstrates flexibility and potential for broader applications in surveillance, automation, and safety monitoring.
Object detection systems have gained immense popularity and importance in recent years, thanks to advancements in computer vision, machine learning, and real-time processing technologies. From security systems to autonomous vehicles, the ability to detect, track, and react to objects in a video stream has vast applications across many industries. This project focuses on a real-time object detection system that also incorporates a novel feature: proximity-based collision alerts.
The initial inspiration for this project stemmed from a personal need. After adopting a cat named Kaladin, I noticed that his natural curiosity often led him to explore the cables behind the television, which posed a risk to both the hardware and the environment. Recognizing the need to address this issue, I conceived a system that would automatically detect his movements and alert me when he ventured too close to the television, thus avoiding potential damage.
The system’s core functionality was designed to detect objects—specifically the cat—and determine if certain objects in its environment, such as the TV or furniture, came within a dangerous proximity. The initial use case was simple: when Kaladin approached the TV, an alert would be triggered to play the sound of a door opening through Alexa, encouraging him to move away from the area. This basic setup led to the development of a broader concept: a flexible, real-time object detection and collision alert system that could be used for various practical applications, from home automation to industrial safety.
At its current stage, this project is still in its early phases of development. The main focus thus far has been on the fundamental logic of real-time object detection, collision detection based on proximity between objects, and triggering alerts based on user-defined thresholds. The system leverages cutting-edge technologies, including YOLOv8 for object detection, OpenCV for video processing, and Streamlit for the real-time interface and user interaction. While the system performs well in its current form, significant improvements are planned in terms of user experience, performance optimization, and scalability.
As with many computer vision systems, the accuracy and performance of real-time object detection systems depend heavily on the computational resources available. The goal of this project is to create a system that balances computational efficiency with accuracy, enabling it to function effectively in various environments without placing too much demand on the hardware. This balance is particularly important as real-time systems must process a continuous stream of video data, detect objects quickly, and calculate distances between objects—all while minimizing latency to ensure timely alerts.
Another significant aspect of this project is its flexibility. Initially designed for a specific use case involving a cat and a TV, the underlying technology can easily be adapted to other scenarios. Whether used for surveillance, industrial monitoring, or autonomous systems, the ability to track multiple objects and trigger alerts based on proximity provides a wide range of potential applications. The collision alert system, in particular, is of interest for any scenario where objects or people need to be monitored for potential collisions, whether in a home, workplace, or public space.
The broader goal of this project is to demonstrate that real-time object detection can be an accessible, flexible, and scalable solution for everyday problems, while also offering potential for more complex and high-demand applications. Although still in the early stages, this system lays the groundwork for more refined and user-friendly implementations in the future. By integrating real-time video processing, machine learning-based object detection, and proximity alerts, this project represents a promising approach to making environments smarter, safer, and more responsive to the objects within them.
The foundation of this project revolves around a real-time video processing pipeline:
Video Input: A video stream is captured via an IP-connected camera. The rate at which input frames are processed can be adjusted to balance detection accuracy and system performance.
Frame Processing and YOLO Integration: Each frame undergoes object detection using YOLOv8, with bounding boxes drawn for identified objects. Detections across consecutive frames are linked, ensuring consistent tracking. Intermittent detections are smoothed out to maintain visual continuity.
Filtering: Specific objects are filtered based on user-defined criteria to reduce visual clutter and improve system responsiveness.
Proximity Detection: Pairs of objects are monitored for proximity using Euclidean distance calculations between the bounding box centers. Only relevant object pairs are considered, reducing false positives.
Alert Generation: When object pairs breach the defined distance threshold, alerts are triggered. Initially connected to an Alexa workflow, the system allows for flexible alert mechanisms to suit various use cases.
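The proximity and alert steps above can be sketched as follows. This is a minimal illustration, not the project's actual API: the function names, the detection format, and the default 200 px threshold (taken from the configuration table later in this document) are assumptions.

```python
import math

def box_center(box):
    """Center (x, y) of a bounding box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def check_collisions(detections, watched_pairs, threshold_px=200):
    """Return the watched label pairs whose box centers are closer than threshold_px.

    detections: dict mapping a label (e.g. "cat") to its bounding box.
    watched_pairs: iterable of (label_a, label_b) tuples to monitor, so that
    only relevant pairs are checked and false positives are reduced.
    """
    alerts = []
    for a, b in watched_pairs:
        if a in detections and b in detections:
            ax, ay = box_center(detections[a])
            bx, by = box_center(detections[b])
            if math.hypot(ax - bx, ay - by) < threshold_px:
                alerts.append((a, b))
    return alerts

# Example: a cat whose box center is 100 px from the TV's center
# triggers an alert at the default 200 px threshold.
dets = {"cat": (0, 0, 100, 100), "tv": (100, 0, 200, 100)}
print(check_collisions(dets, [("cat", "tv")]))  # [('cat', 'tv')]
```

In a real deployment the `alerts` list would feed whatever alert mechanism is configured, such as the Alexa workflow described above.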
This system is implemented entirely in Python, leveraging YOLOv8 for object detection, OpenCV for video capture and processing, Streamlit for the real-time interface, and Docker Compose for deployment.
Tests involved varying frame rates, detection rates, and proximity thresholds to assess system performance under different configurations.
Overall, the system performs effectively, though it still requires optimization for dynamic parameter adjustment and better user interaction.
The real-time object detection and collision alert system developed in this project represents an important first step toward more sophisticated applications in object monitoring, particularly in domestic environments. While the system is functional and provides accurate alerts with minimal latency, several aspects require further refinement to optimize its performance and usability.
Processing speed and latency depend heavily on hardware performance. In our tests on a powerful laptop (ASUS TUF Dash F15 with an RTX 3060 and an 11th-Gen Intel Core i7), the system processed video in real time with latency consistently under 1 second. Latency increased, however, when the system was overloaded, especially at high frame rates with many objects tracked simultaneously. Frame buffering mitigated this issue, though it introduces delays when the buffer accumulates, particularly under heavy workloads.
YOLOv8 provided robust object detection, but some inaccuracies occurred due to "popping" or flickering of bounding boxes when objects were detected intermittently. This is a common issue in real-time object detection systems and can be addressed by fine-tuning the detection threshold or by integrating more advanced tracking algorithms that maintain object identity across frames. The proximity detection and collision alert functionality proved reliable under normal operating conditions, and the ability to dynamically adjust the collision threshold gives the system flexibility across use cases.
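A simple way to suppress this flicker is to keep a label's last bounding box alive for a few frames after the detector stops reporting it. The sketch below is one possible smoothing scheme, not the project's implementation; dedicated trackers (e.g. ByteTrack, which Ultralytics ships as a tracking mode) would be a more robust alternative.

```python
class DetectionSmoother:
    """Suppress bounding-box flicker by keeping each label's last box
    alive for up to `patience` frames after it stops being detected.
    (Sketch only; a real tracker would also handle identity switches.)"""

    def __init__(self, patience=5):
        self.patience = patience
        self.memory = {}  # label -> (box, frames_since_last_seen)

    def update(self, detections):
        """detections: dict of label -> box for the current frame.
        Returns the smoothed set of boxes to draw."""
        for label, box in detections.items():
            self.memory[label] = (box, 0)  # freshly seen: reset age
        stale = []
        for label, (box, age) in self.memory.items():
            if label not in detections:
                if age + 1 > self.patience:
                    stale.append(label)  # missing too long: drop it
                else:
                    self.memory[label] = (box, age + 1)
        for label in stale:
            del self.memory[label]
        return {label: box for label, (box, _) in self.memory.items()}

smoother = DetectionSmoother(patience=2)
smoother.update({"cat": (0, 0, 10, 10)})
print(smoother.update({}))  # cat persists: {'cat': (0, 0, 10, 10)}
print(smoother.update({}))  # still within patience
print(smoother.update({}))  # patience exhausted: {}
```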
While the system is functional, the user interface (UI) is currently basic, with most effort so far devoted to the underlying logic. Future work will improve the UI to allow easier configuration of parameters such as object filtering, collision pairs, and distance thresholds. Making these settings adjustable at runtime will make the system considerably more user-friendly.
While the initial motivation for this project came from a personal need to monitor a cat’s behavior, the underlying technology has broader applications in surveillance, safety, and automation. For example, it could be used in industrial environments to monitor machinery for proximity alerts or in home security systems for detecting suspicious movement or object collisions.
Moving forward, several areas offer clear room for improvement: a richer user interface, dynamic runtime configuration of filters, collision pairs, and thresholds, performance tuning, structured environment variable support, and modularization of the codebase.
In conclusion, this project offers an exciting start in the field of real-time object detection and alert systems, and its continued development promises to lead to a fully functional, adaptable solution for both home and industrial applications.
This project is a promising start towards a flexible and efficient real-time object detection system with proximity alerts. However, it is still in its early stages, with significant room for improvement in terms of user interface, dynamic configurations, and performance tuning. Future iterations will focus on refining detection stability, enhancing usability, and expanding the system’s adaptability to different environments.
Streamlit Documentation: Streamlit Official Documentation
OpenCV Documentation: OpenCV Official Documentation
YOLO Documentation: Ultralytics YOLOv8 Documentation
Docker Documentation: Docker
DroidCam: DroidCam App
This project was made possible only thanks to the accessibility and implementation ease of open-source video processing and recognition systems. However, I would like to give a special thanks to my cat, Kaladin (Kal to his friends), whose curiosity about the cables behind the TV pushed me to create this real-time detection system.
At this stage, environment variables are not necessary. The system runs using a single script containing all functionality, launched by default via Docker Compose.
Upcoming improvements will include structured environment variable support for easier customization, dynamic configuration through the interface, and modularization of the codebase for better maintainability.
Currently, the following variables are planned for future integration:
| Variable | Description | Default Value |
| --- | --- | --- |
| VIDEO_URL | URL of the video stream from the webcam/IP camera | None |
| COLLISION_THRESHOLD | Distance threshold for collision alerts (in px) | 200 |
| FRAME_RATE | Frame processing rate (frames per second) | 30 |
| DETECTION_RATE | Object detection frequency (detections per second) | 4 |
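Once integrated, these variables could be read with `os.getenv`, falling back to the defaults in the table above. This is a sketch of the intended mechanism, not code that exists in the project yet:

```python
import os

# Planned configuration variables, with the defaults from the table above.
# (Illustrative; not yet wired into the project.)
VIDEO_URL = os.getenv("VIDEO_URL")  # no default: must be provided by the user
COLLISION_THRESHOLD = int(os.getenv("COLLISION_THRESHOLD", "200"))  # pixels
FRAME_RATE = int(os.getenv("FRAME_RATE", "30"))       # frames per second
DETECTION_RATE = int(os.getenv("DETECTION_RATE", "4"))  # detections per second

print(COLLISION_THRESHOLD, FRAME_RATE, DETECTION_RATE)  # 200 30 4 with no overrides set
```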
The single script handles all core functionality and is launched automatically via Docker Compose, keeping current testing and development simple.
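For such a single-service setup, the compose file might look like the following. This is an illustrative config fragment, not the project's actual file: the service name, build context, and environment wiring are assumptions; only the Streamlit port (8501) comes from the deployment instructions.

```yaml
services:
  detector:
    build: .             # image built from the repository's Dockerfile
    ports:
      - "8501:8501"      # Streamlit's default port, used by this project
    environment:
      - VIDEO_URL=${VIDEO_URL}  # planned variable; not yet consumed by the script
```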
Clone the repository:

git clone <repository-url>
cd <repository-directory>

Build and launch with Docker Compose:

docker-compose up --build

Access the interface at http://localhost:8501.
This appendix includes configuration details to facilitate reproducibility and ease of deployment.
There are no datasets linked