GitHub Repository: https://github.com/realjero/KeypointRCNN
Mert Özmeral & Jerome Habanz
This project implements Keypoint R-CNN in PyTorch, trained on the COCO 2017 dataset. It consists of three parts: a training script that optimizes the model parameters for keypoint detection, a test script that evaluates the trained model on validation data or custom images, and a detection tool for interactive, real-time testing with a webcam. Together they cover the workflow for human pose estimation, from training through evaluation to live inference.
Training uses the COCO 2017 dataset. Place the person keypoint annotations and the images in the following directory layout:
root
├───coco
│   ├───annotations
│   │   ├───person_keypoints_train2017.json
│   │   └───person_keypoints_val2017.json
│   ├───train2017
│   │   └───(...).jpg
│   └───val2017
│       └───(...).jpg
├───e42_b8_lr0.02_m0.9.pth
...
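Before starting a long training run, it can help to confirm that the layout above is in place. The following is a minimal sketch using only the standard library. The helper name `missing_coco_paths` and its `root` argument are illustrative and not part of the repository.

```python
from pathlib import Path

# Paths taken from the directory tree above, relative to the project root.
EXPECTED = [
    "coco/annotations/person_keypoints_train2017.json",
    "coco/annotations/person_keypoints_val2017.json",
    "coco/train2017",
    "coco/val2017",
]


def missing_coco_paths(root="."):
    """Return the expected dataset paths that do not exist under root."""
    root = Path(root)
    return [p for p in EXPECTED if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_coco_paths()
    if missing:
        print("Missing dataset paths:")
        for p in missing:
            print("  " + p)
    else:
        print("COCO layout looks complete.")
```

An empty result only means the files and folders exist. It does not validate the annotation contents or the number of images.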
Make sure the required packages (see requirements.txt) are installed before running the scripts.
python train.py
losses=9.61300 loss_classifier=0.76776 loss_box_reg=0.01154 loss_keypoint=8.07662 loss_objectness=0.69867 loss_rpn_box_reg=0.05841: 0%| | 0/42 [00:01<?, ?it/s]
...
losses=3.67900 loss_classifier=0.10942 loss_box_reg=0.16290 loss_keypoint=3.31874 loss_objectness=0.01995 loss_rpn_box_reg=0.06798: 100%|██████████| 42/42 [83:16:32<00:00, 7137.92s/it]
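Each progress line reports the total loss followed by five components, whose names match the loss dictionary that torchvision's Keypoint R-CNN returns in training mode. In both lines shown, `losses` equals the sum of the components up to rounding. The sketch below extracts these values from a captured log line. `parse_losses` is a hypothetical helper, not part of the repository.

```python
import re

# Matches "name=number" pairs such as "loss_keypoint=8.07662".
PAIR = re.compile(r"(\w+)=(\d+\.\d+)")


def parse_losses(line):
    """Parse the 'key=value' loss prefix of a progress-bar line into a dict."""
    prefix = line.split(":", 1)[0]  # drop the progress bar after the first colon
    return {name: float(value) for name, value in PAIR.findall(prefix)}


first = parse_losses(
    "losses=9.61300 loss_classifier=0.76776 loss_box_reg=0.01154 "
    "loss_keypoint=8.07662 loss_objectness=0.69867 loss_rpn_box_reg=0.05841: "
    "0%| | 0/42 [00:01<?, ?it/s]"
)
# Sum every component except the reported total.
components = sum(v for k, v in first.items() if k != "losses")
print(round(components, 5))  # 9.613
```

The keypoint term dominates the total at both the start (8.07662 of 9.61300) and the end (3.31874 of 3.67900) of the run shown above.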
python test.py
Evaluation for *bbox*:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.534
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.811
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.582
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.362
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.620
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.697
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.187
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.548
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.636
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.488
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.701
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.781
Evaluation for *keypoints*:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.642
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.853
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.701
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.608
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.707
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.711
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.904
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.765
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.665
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.777
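The two blocks above follow the summary format printed by pycocotools. The keypoint block reports no "small" area range and uses maxDets=20, and for keypoints the threshold labelled IoU is applied to object keypoint similarity (OKS) rather than box overlap. To compare runs, the lines can be turned into structured records. The sketch below assumes the line format shown above; `parse_summary` is a hypothetical helper, not part of the repository.

```python
import re

# One summary line, e.g.
# "Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.642"
LINE = re.compile(
    r"Average (?:Precision|Recall)\s+\((AP|AR)\)\s+@\[\s*IoU=([\d.:]+)\s*\|"
    r"\s*area=\s*(\w+)\s*\|\s*maxDets=\s*(\d+)\s*\]\s*=\s*(-?\d+\.\d+)"
)


def parse_summary(text):
    """Return one dict per COCO summary line found in text."""
    rows = []
    for metric, iou, area, max_dets, value in LINE.findall(text):
        rows.append({
            "metric": metric,          # "AP" or "AR"
            "iou": iou,                # threshold or range, kept as text
            "area": area,              # "all", "small", "medium", "large"
            "max_dets": int(max_dets),
            "value": float(value),
        })
    return rows


sample = """\
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.642
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.904
"""
for row in parse_summary(sample):
    print(row["metric"], row["iou"], row["value"])
```

The pattern tolerates variable spacing, so it should work on both the raw console output and a whitespace-collapsed copy like the one above.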
python detect.py
python detect.py --file image.jpg
python detect.py --file video.mp4
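The three invocations above differ only in the optional `--file` argument. How detect.py picks its input is not shown here, so the following is only one plausible sketch. The extension sets, the function names, and the assumption that omitting `--file` selects the webcam are illustrative guesses, not the repository's code.

```python
import argparse
from pathlib import Path

# Assumed extension sets; the real script may accept others.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}
VIDEO_EXTS = {".mp4", ".avi", ".mov"}


def select_source(file=None):
    """Classify the input as 'webcam', 'image', or 'video'."""
    if file is None:
        return "webcam"  # assumption: no --file means live camera input
    ext = Path(file).suffix.lower()
    if ext in IMAGE_EXTS:
        return "image"
    if ext in VIDEO_EXTS:
        return "video"
    raise ValueError(f"unsupported file type: {ext!r}")


def build_parser():
    """Build a parser with the single optional --file argument."""
    parser = argparse.ArgumentParser(description="Keypoint detection demo")
    parser.add_argument("--file", default=None, help="image or video path")
    return parser


# Walk through the three command lines shown above.
for argv in ([], ["--file", "image.jpg"], ["--file", "video.mp4"]):
    args = build_parser().parse_args(argv)
    print(argv, "->", select_source(args.file))
```

Dispatching on the file extension keeps the command line to a single flag, at the cost of rejecting files whose extension is missing or unusual.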
The required packages are listed in requirements.txt and below:
matplotlib
numpy
opencv_python
Pillow
pycocotools
torch
torchvision
tqdm
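The distribution names above are not always the names used in import statements: opencv_python is imported as `cv2` and Pillow as `PIL`. The sketch below reports which of the listed packages cannot be found in the current environment. `missing_modules` is a hypothetical helper, not part of the repository.

```python
from importlib.util import find_spec

# Distribution name (as in the list above) -> top-level import name.
IMPORT_NAMES = {
    "matplotlib": "matplotlib",
    "numpy": "numpy",
    "opencv_python": "cv2",
    "Pillow": "PIL",
    "pycocotools": "pycocotools",
    "torch": "torch",
    "torchvision": "torchvision",
    "tqdm": "tqdm",
}


def missing_modules(import_names=IMPORT_NAMES):
    """Return the distribution names whose import module cannot be found."""
    return [dist for dist, mod in import_names.items() if find_spec(mod) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```

The check only looks the modules up without importing them, so it runs quickly but says nothing about installed versions.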