This paper presents a robust pedestrian tracking system capable of real-time performance. The system combines state-of-the-art deep learning models with GPU acceleration to achieve efficient detection and tracking. YOLOv8 detects pedestrians in each video frame, producing regions of interest; ResNet-18 then extracts an appearance feature that encodes the distinct characteristics of each detected pedestrian; and the Deep SORT algorithm uses these detections and features to track multiple pedestrians across frames. The system is implemented in C++ and CUDA, with model inference optimized using TensorRT for high-speed computation and with CUDA acceleration applied to both the preprocessing and postprocessing stages. It achieves a processing speed of 10 frames per second (fps), demonstrating its effectiveness and efficiency in real-world applications.
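The per-frame flow of this pipeline can be sketched as follows. This is a minimal illustrative outline rather than the paper's actual code: the class names (Yolov8Detector, ResNet18Encoder, DeepSortTracker), the 512-dimensional feature size, and the stub bodies are assumptions standing in for the TensorRT-backed inference described above.

```cpp
// Hypothetical per-frame pipeline sketch; names and stubs are illustrative only.
#include <cstdint>
#include <vector>

// Axis-aligned pedestrian box produced by the detector.
struct Detection {
    float x, y, w, h;   // box in pixel coordinates
    float score;        // detection confidence
};

// A tracked pedestrian with a persistent identity.
struct Track {
    int id;
    Detection box;
};

// Placeholder wrappers; in the described system these would run
// TensorRT-optimized YOLOv8 / ResNet-18 engines with CUDA pre/post-processing.
class Yolov8Detector {
public:
    std::vector<Detection> detect(const uint8_t* frame, int width, int height) {
        return {};  // stub: real version runs the YOLOv8 TensorRT engine
    }
};

class ResNet18Encoder {
public:
    // One appearance embedding per detection (512-d is an assumed size).
    std::vector<std::vector<float>> encode(const uint8_t* frame, int width, int height,
                                           const std::vector<Detection>& dets) {
        return std::vector<std::vector<float>>(dets.size(), std::vector<float>(512, 0.f));
    }
};

class DeepSortTracker {
public:
    // Associates detections with existing tracks using motion and appearance cues.
    std::vector<Track> update(const std::vector<Detection>& dets,
                              const std::vector<std::vector<float>>& feats) {
        return {};  // stub: real version runs Kalman prediction + assignment
    }
};

// One iteration of the detect -> encode -> track loop.
std::vector<Track> processFrame(Yolov8Detector& det, ResNet18Encoder& enc,
                                DeepSortTracker& trk,
                                const uint8_t* frame, int width, int height) {
    auto boxes = det.detect(frame, width, height);          // 1) pedestrian detection
    auto feats = enc.encode(frame, width, height, boxes);   // 2) appearance features
    return trk.update(boxes, feats);                        // 3) identity assignment
}
```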
The proposed pedestrian tracking system achieves a processing speed of 10 fps on standard GPU hardware. The combination of YOLOv8 for detection, ResNet-18 for feature extraction, and Deep SORT for tracking delivers accurate and reliable performance. CUDA and TensorRT optimizations significantly reduce latency in both model inference and the auxiliary preprocessing and postprocessing stages, enabling real-time operation. Experiments demonstrate that the system effectively tracks multiple pedestrians in dynamic and cluttered environments, making it suitable for real-world applications such as surveillance and autonomous navigation.
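As a rough illustration of how such a TensorRT-optimized inference path is typically wired up, the sketch below deserializes a prebuilt engine and enqueues inference on a CUDA stream. The engine file name, buffer shapes, and binding layout are placeholders, not details taken from the paper's implementation.

```cpp
// Minimal sketch: deserialize a prebuilt TensorRT engine and run async inference
// on a CUDA stream. Sizes and paths are illustrative placeholders.
#include <NvInfer.h>
#include <cuda_runtime_api.h>
#include <fstream>
#include <iostream>
#include <vector>

// TensorRT requires a logger; warnings and errors go to stderr.
class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::cerr << msg << "\n";
    }
};

int main() {
    Logger logger;

    // Read a serialized engine built offline (e.g. from a YOLOv8 ONNX export).
    std::ifstream file("yolov8_fp16.engine", std::ios::binary);  // hypothetical path
    std::vector<char> blob((std::istreambuf_iterator<char>(file)),
                            std::istreambuf_iterator<char>());

    auto* runtime = nvinfer1::createInferRuntime(logger);
    auto* engine  = runtime->deserializeCudaEngine(blob.data(), blob.size());
    auto* context = engine->createExecutionContext();

    // Device buffers for the engine's input/output bindings (sizes are placeholders).
    void* bindings[2] = {nullptr, nullptr};
    cudaMalloc(&bindings[0], 3 * 640 * 640 * sizeof(float));  // preprocessed frame tensor
    cudaMalloc(&bindings[1], 84 * 8400 * sizeof(float));      // raw detector output

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Preprocessing (resize/normalize) and postprocessing (NMS) would run as CUDA
    // kernels on the same stream, keeping the whole per-frame path on the GPU.
    context->enqueueV2(bindings, stream, nullptr);
    cudaStreamSynchronize(stream);

    cudaFree(bindings[0]);
    cudaFree(bindings[1]);
    cudaStreamDestroy(stream);
    return 0;  // TensorRT object cleanup omitted for brevity
}
```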