YOLOv8
In this project, I build a car detector with YOLO (You Only Look Once), an object detection architecture known for its real-time speed and high accuracy. YOLO processes an image in a single forward pass through the network, which lets it detect objects quickly. I fine-tune a pretrained YOLOv8 model on a dataset of car images taken from various angles. The goal is accurate detection of cars in real-world scenes.
What is YOLOv8?
YOLOv8 is a family of computer vision models developed by Ultralytics, the creators of YOLOv5. It is designed for real-time object detection and improves on the accuracy of earlier YOLO versions while keeping inference fast.
Key Features and Improvements:
Advanced Architecture: YOLOv8 incorporates a new backbone architecture, a redesigned neck, and improved head designs for enhanced performance.
Anchor-Free Detection: It adopts an anchor-free approach, eliminating the need for predefined anchor boxes, which improves generalization and learning speed.
Efficient Training and Inference: YOLOv8 is optimized for efficient training and inference, making it suitable for deployment on various devices, including NVIDIA Jetson, NVIDIA GPUs, and macOS systems with Roboflow Inference.
Versatile Applications: It can be used for a wide range of object detection tasks, including image and video analysis, surveillance systems, autonomous vehicles, and more.
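To make the anchor-free idea concrete, here is a minimal sketch of how a box can be decoded without predefined anchor boxes: each grid cell predicts its distances to the four sides of the box, measured from the cell center. This is a simplified illustration of the general technique, not the Ultralytics implementation. The function name `decode_anchor_free`, the `(left, top, right, bottom)` layout, and the stride of 8 are assumptions made for the example.

```python
import numpy as np

def decode_anchor_free(distances, stride):
    """Turn per-cell side distances into pixel boxes.

    distances: array of shape (H, W, 4) holding (left, top, right, bottom)
               distances from each cell center, in grid units (assumed layout).
    stride:    number of input pixels covered by one grid cell.
    Returns an (H, W, 4) array of (xmin, ymin, xmax, ymax) in pixels.
    """
    h, w, _ = distances.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    cx = xs + 0.5  # cell centers in grid units
    cy = ys + 0.5
    left, top, right, bottom = np.moveaxis(distances, -1, 0)
    boxes = np.stack([cx - left, cy - top, cx + right, cy + bottom], axis=-1)
    return boxes * stride

# Every cell of a 2x2 grid predicts a box reaching one cell in each direction
demo = decode_anchor_free(np.ones((2, 2, 4)), stride=8)
print(demo[0, 0])
```

Because the box is expressed relative to the cell itself, no anchor shapes have to be chosen or tuned for the dataset beforehand.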
How YOLOv8 Differs from Previous YOLO Versions:
Enhanced Performance: Ultralytics reports higher COCO accuracy (mAP) for YOLOv8 than for YOLOv5 models of comparable size, while inference stays fast enough for real-time use.
Anchor-Free Approach: The shift to anchor-free detection simplifies the training process and improves model generalization.
New Architecture: The redesigned backbone, neck, and head architectures contribute to the improved performance.
Wider Range of Applications: YOLOv8 can be applied to a broader range of object detection tasks due to its versatility.
!pip install ultralytics

import os  # For interacting with the operating system (e.g., file paths)
import pandas as pd  # For loading and manipulating CSV data (bounding boxes and image info)
import numpy as np  # For numerical operations and array handling
import cv2  # For image processing and manipulation (OpenCV library)
import matplotlib.pyplot as plt  # For visualizing data and images (plots)
import seaborn as sns  # For advanced data visualization (especially histograms and distributions)
from glob import glob  # For finding all image files in a directory (using wildcard patterns)
from sklearn.model_selection import train_test_split  # For splitting the dataset into training and testing sets
from PIL import Image
from ultralytics import YOLO
import warnings

warnings.filterwarnings("ignore", "use_inf_as_na option is deprecated")
# Create necessary directories
!mkdir -p "/kaggle/working/data"
!mkdir -p "/kaggle/working/data/images"
!mkdir -p "/kaggle/working/data/images/train"
!mkdir -p "/kaggle/working/data/images/val"
!mkdir -p "/kaggle/working/data/labels"
!mkdir -p "/kaggle/working/data/labels/train"
!mkdir -p "/kaggle/working/data/labels/val"

root_dir = "/kaggle/working/data"
labels_dir = "/kaggle/working/data/labels"
images_dir = "/kaggle/working/data/images"

# Define paths
train_data = "/kaggle/input/car-object-detection/data/training_images"
csv_data = "/kaggle/input/car-object-detection/data/train_solution_bounding_boxes (1).csv"
test_data = "/kaggle/input/car-object-detection/data/testing_images"
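The `!mkdir` lines only work inside a notebook. If the same setup is needed in a plain Python script, a sketch along these lines could build the folder tree instead. The helper name `make_yolo_dirs` is hypothetical, and the demo runs in a temporary directory rather than in `/kaggle/working/data`.

```python
from pathlib import Path
import tempfile

def make_yolo_dirs(root):
    """Create images/{train,val} and labels/{train,val} under root and return the folders."""
    root = Path(root)
    created = []
    for kind in ("images", "labels"):
        for split in ("train", "val"):
            folder = root / kind / split
            folder.mkdir(parents=True, exist_ok=True)  # same effect as mkdir -p
            created.append(folder)
    return created

# Demo in a throwaway directory; in the notebook, root would be "/kaggle/working/data"
with tempfile.TemporaryDirectory() as tmp:
    for folder in make_yolo_dirs(tmp):
        print(folder.relative_to(tmp))
```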
# Loading the CSV data
df = pd.read_csv(csv_data)

# Display the first few rows of the dataframe to understand its structure
df.head()
# df.info() prints its summary directly and returns None, so it is not wrapped in print()
df.info()
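Before converting the annotations, it can help to check that every box is well formed. The sketch below assumes the CSV columns are `image`, `xmin`, `ymin`, `xmax`, and `ymax` (the names used in the rest of this notebook). The helper `find_invalid_boxes` and the sample rows are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

def find_invalid_boxes(boxes):
    """Return the rows whose box has no area or contains a negative coordinate."""
    has_no_area = (boxes["xmax"] <= boxes["xmin"]) | (boxes["ymax"] <= boxes["ymin"])
    is_negative = (boxes[["xmin", "ymin", "xmax", "ymax"]] < 0).any(axis=1)
    return boxes[has_no_area | is_negative]

# Made-up rows: one valid box, one with xmax < xmin, one with a negative xmin
example = pd.DataFrame({
    "image": ["a.jpg", "b.jpg", "c.jpg"],
    "xmin": [10.0, 50.0, -5.0],
    "ymin": [20.0, 30.0, 10.0],
    "xmax": [110.0, 40.0, 60.0],
    "ymax": [80.0, 90.0, 70.0],
})
print(find_invalid_boxes(example)["image"].tolist())  # ['b.jpg', 'c.jpg']
```

On the real data, `find_invalid_boxes(df)` would ideally return an empty dataframe.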
# Distribution of bounding box sizes
plt.figure(figsize=(12, 6))
df['width'] = df['xmax'] - df['xmin']
df['height'] = df['ymax'] - df['ymin']
sns.scatterplot(x='width', y='height', data=df)
plt.title('Distribution of bounding box sizes')
plt.xlabel('Width')
plt.ylabel('Height')
plt.show()
# Visualizing the distribution of bounding box widths and heights
df['box_width'] = df['xmax'] - df['xmin']
df['box_height'] = df['ymax'] - df['ymin']

plt.figure(figsize=(12, 6))

plt.subplot(1, 2, 1)
sns.histplot(df['box_width'], bins=20, kde=True)
plt.title('Distribution of Bounding Box Widths')

plt.subplot(1, 2, 2)
sns.histplot(df['box_height'], bins=20, kde=True)
plt.title('Distribution of Bounding Box Heights')

plt.tight_layout()
plt.show()
YOLO Data Preparation and Training
# Prepare YOLO format annotations
def create_yolo_annotation(row, img_width, img_height):
    # A YOLO label line is "class x_center y_center width height", normalized to [0, 1]
    x_center = ((row['xmin'] + row['xmax']) / 2) / img_width
    y_center = ((row['ymin'] + row['ymax']) / 2) / img_height
    width = (row['xmax'] - row['xmin']) / img_width
    height = (row['ymax'] - row['ymin']) / img_height
    return f"0 {x_center} {y_center} {width} {height}"

np.random.seed(42)  # Fixed seed so the train/val split is reproducible

# Create YOLO annotations and copy images
for img_name in df['image'].unique():
    img_df = df[df['image'] == img_name]
    img_path = os.path.join(train_data, img_name)
    img = cv2.imread(img_path)
    if img is not None:
        img_height, img_width = img.shape[:2]

        # Decide whether to put in train or val folder (roughly 80% train, 20% val)
        subset = "train" if np.random.rand() < 0.8 else "val"

        # Write the image into the chosen split folder
        dst_img_path = os.path.join(images_dir, subset, img_name)
        cv2.imwrite(dst_img_path, img)

        # Create annotation file (splitext also handles file names with extra dots)
        annotation_path = os.path.join(labels_dir, subset, f"{os.path.splitext(img_name)[0]}.txt")
        with open(annotation_path, 'w') as f:
            for _, row in img_df.iterrows():
                yolo_annotation = create_yolo_annotation(row, img_width, img_height)
                f.write(yolo_annotation + '\n')

# Create YAML configuration file
yaml_content = f"""
path: {root_dir}
train: images/train
val: images/val

nc: 1
names: ['car']
"""

with open('car_detection.yaml', 'w') as f:
    f.write(yaml_content)

print("YAML configuration file created.")
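As a quick check of the conversion, the sketch below works through one box by hand and converts it back to pixel corners. The helper names `to_yolo` and `from_yolo`, the box coordinates, and the 400x200 image size are hypothetical values chosen so the arithmetic is easy to follow.

```python
def to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Pixel corners -> normalized (x_center, y_center, width, height)."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    return x_center, y_center, (xmax - xmin) / img_w, (ymax - ymin) / img_h

def from_yolo(x_center, y_center, width, height, img_w, img_h):
    """Normalized YOLO box -> pixel corners (xmin, ymin, xmax, ymax)."""
    xmin = (x_center - width / 2) * img_w
    ymin = (y_center - height / 2) * img_h
    xmax = (x_center + width / 2) * img_w
    ymax = (y_center + height / 2) * img_h
    return xmin, ymin, xmax, ymax

# A 200x80 pixel box inside a 400x200 image
yolo_box = to_yolo(100, 40, 300, 120, img_w=400, img_h=200)
print(yolo_box)  # (0.5, 0.4, 0.5, 0.4)

# Converting back should recover the original corners (up to float rounding)
print(from_yolo(*yolo_box, img_w=400, img_h=200))
```

All four values fall between 0 and 1, which is what the label files are expected to contain. A value outside that range would point to a box that extends past the image border.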
model = YOLO('yolov8n.pt')

# Disable W&B logging to avoid the API key prompt
os.environ["WANDB_MODE"] = "disabled"

results = model.train(
    data='car_detection.yaml',
    epochs=30,
    imgsz=640,
    batch=16,
    name='car_detection_model'
)
model.save('car_detection_model.pt')
print("Model saved successfully.")
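A natural next step is to load the saved weights and run them on the test images. The sketch below is one possible way to do that and has not been run as part of this notebook. The helper names `keep_confident` and `detect_cars` and the 0.5 confidence threshold are assumptions. `ultralytics` is imported inside the function so that the filtering helper can be tried on its own.

```python
import numpy as np

def keep_confident(boxes_xyxy, scores, min_conf=0.5):
    """Keep only the boxes whose confidence score is at least min_conf."""
    boxes_xyxy = np.asarray(boxes_xyxy, dtype=float).reshape(-1, 4)
    scores = np.asarray(scores, dtype=float)
    mask = scores >= min_conf
    return boxes_xyxy[mask], scores[mask]

def detect_cars(weights_path, source, min_conf=0.5):
    """Run a saved YOLOv8 model over a folder of images.

    Returns a dict mapping each image path to (boxes, scores), where boxes are
    (xmin, ymin, xmax, ymax) in pixels.
    """
    from ultralytics import YOLO  # lazy import: only needed for real inference

    model = YOLO(weights_path)
    detections = {}
    # stream=True yields results one image at a time instead of building a full list
    for result in model.predict(source=source, conf=min_conf, stream=True, verbose=False):
        boxes = result.boxes.xyxy.cpu().numpy()
        scores = result.boxes.conf.cpu().numpy()
        detections[result.path] = keep_confident(boxes, scores, min_conf)
    return detections

# The filtering step on its own, with made-up boxes and scores
boxes, scores = keep_confident([[0, 0, 10, 10], [5, 5, 20, 20]], [0.9, 0.3])
print(len(boxes))  # 1
```

In this notebook, the call could look like `detect_cars('car_detection_model.pt', test_data)`, using the weights file saved above and the test folder defined earlier.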
The End