This study uses the Intel Image Classification dataset to evaluate and compare the performance of Convolutional Neural Networks (CNNs) and Graph Convolutional Networks (GCNs) for image classification. The dataset consists of around 25,000 images distributed across six categories: buildings, forest, glacier, mountain, sea, and street. We trained four models and obtained test accuracies of 98% with a CNN, 53% with a GCN on graphs built by Delaunay triangulation, 70% with a GCN on graphs built from SLIC superpixels, and 73% with the SLIC-based GCN plus data augmentation. These results underscore the strengths and limitations of each approach while providing insights for future advancements in image classification tasks.
The Intel Image Classification dataset contains around 25,000 images, each of size 150x150 pixels, categorized into six distinct classes:
Buildings: Label 0
Forest: Label 1
Glacier: Label 2
Mountain: Label 3
Sea: Label 4
Street: Label 5
The data is divided into three subsets:
Training Set: ~14,000 images
Test Set: ~3,000 images
Prediction Set: ~7,000 images
The dataset was initially published on DataHack by Intel for an image classification challenge. It provides an excellent opportunity to develop neural network models for high-accuracy classification tasks.
CNN
Input: 100x100x3 image.
Layers: Convolutional layers with ReLU activation, max-pooling, fully connected layers, and softmax for classification.
Output: Six-class probability distribution
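The source gives the 100x100x3 input size but not the kernel sizes, filter counts, or number of blocks. As a rough sketch only, assuming three blocks of unpadded 3x3 convolutions with 32, 64, and 128 filters (all hypothetical values), each followed by 2x2 max-pooling, the feature-map sizes that would reach the fully connected layers can be traced like this:

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Spatial size after a convolution (standard output-size formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window=2):
    """Spatial size after non-overlapping max-pooling."""
    return size // window


def trace_shapes(input_size=100, channels=(32, 64, 128)):
    """Return (height, width, channels) after each conv + pool block.

    The channel counts are illustrative assumptions, not the values
    used in the study.
    """
    size = input_size
    shapes = []
    for c in channels:
        size = pool_out(conv_out(size))
        shapes.append((size, size, c))
    return shapes


shapes = trace_shapes()
# Number of inputs to the first fully connected layer after flattening.
flat_features = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
print(shapes, flat_features)
```

Under these assumptions the flattened vector is what the fully connected layers map down to the six softmax outputs; different kernel sizes or padding would change the numbers but not the procedure.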
GCN with Delaunay Triangulation
Input: Graphs generated from image regions using Delaunay triangulation.
Layers: Graph convolutional layers with BatchNorm and ReLU, global pooling, and fully connected layers.
Output: Six-class probability distribution.
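A minimal sketch of this pipeline, using `scipy.spatial.Delaunay` and NumPy, may help make the steps concrete. How the node positions and features were chosen is not stated in the source, so the five points and random colour features below are placeholders. The graph convolution uses the common symmetric normalisation of the adjacency matrix with self-loops; BatchNorm is omitted for brevity, and the weights are random rather than trained.

```python
import numpy as np
from scipy.spatial import Delaunay


def delaunay_edges(points):
    """Undirected edge list (E, 2) from the triangles of a Delaunay triangulation."""
    tri = Delaunay(points)
    edges = set()
    for a, b, c in tri.simplices:
        for u, v in ((a, b), (b, c), (a, c)):
            edges.add((int(min(u, v)), int(max(u, v))))
    return np.array(sorted(edges), dtype=np.int64)


def gcn_layer(x, edges, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = x.shape[0]
    adj = np.eye(n)  # self-loops
    adj[edges[:, 0], edges[:, 1]] = 1.0
    adj[edges[:, 1], edges[:, 0]] = 1.0
    deg_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    norm_adj = deg_inv_sqrt[:, None] * adj * deg_inv_sqrt[None, :]
    return np.maximum(norm_adj @ x @ weight, 0.0)


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


rng = np.random.default_rng(0)
# Placeholder node positions: four corners of a square plus its centre.
points = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5]], dtype=float)
edges = delaunay_edges(points)

# Placeholder node features: (x, y) position plus a random RGB colour.
x = np.hstack([points, rng.random((len(points), 3))])
hidden = gcn_layer(x, edges, rng.normal(size=(5, 16)))
graph_embedding = hidden.mean(axis=0)  # global mean pooling
probs = softmax(graph_embedding @ rng.normal(size=(16, 6)))  # six classes
print(edges.shape, probs.round(3))
```

In practice a graph learning library would normally supply the layer and the training loop; the NumPy version is only meant to show what one layer computes on a triangulated graph.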
GCN with SLIC Segmentation
Input: Graphs created from superpixels using SLIC segmentation.
Layers: Similar to the GCN with Delaunay triangulation.
Output: Six-class probability distribution.
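One plausible way to turn a superpixel segmentation into a graph is sketched below. It assumes a 2D integer label map with contiguous labels starting at 0, such as the output of a SLIC implementation like `skimage.segmentation.slic`; the segmentation step itself is not shown. The choice of mean colour as the node feature and 4-connectivity for the edges is an assumption, since the source does not describe its graph construction in detail.

```python
import numpy as np


def superpixel_graph(image, labels):
    """Build node features and edges from a superpixel label map.

    image:  (H, W, C) float array
    labels: (H, W) int array, labels 0..n-1, every label present
    Returns (features, edges): mean colour per superpixel, shape (n, C),
    and a sorted undirected edge list of shape (E, 2).
    """
    n = int(labels.max()) + 1
    flat = labels.ravel()
    counts = np.bincount(flat, minlength=n)
    sums = np.stack(
        [np.bincount(flat, weights=image[..., c].ravel(), minlength=n)
         for c in range(image.shape[-1])],
        axis=1,
    )
    features = sums / counts[:, None]

    # Superpixels are neighbours if they touch horizontally or vertically.
    horiz = np.stack([labels[:, :-1].ravel(), labels[:, 1:].ravel()], axis=1)
    vert = np.stack([labels[:-1, :].ravel(), labels[1:, :].ravel()], axis=1)
    pairs = np.concatenate([horiz, vert])
    pairs = pairs[pairs[:, 0] != pairs[:, 1]]
    edges = np.unique(np.sort(pairs, axis=1), axis=0)
    return features, edges


# Toy 4x4 "segmentation" with four square superpixels.
labels = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1],
                   [2, 2, 3, 3],
                   [2, 2, 3, 3]])
image = np.repeat(labels[..., None], 3, axis=2).astype(float)
features, edges = superpixel_graph(image, labels)
print(features.shape, edges.tolist())
```

The resulting feature matrix and edge list are the inputs a graph convolutional layer expects; richer node features (centroid position, colour variance) could be added in the same way.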
GCN with SLIC and Data Augmentation
Enhancements: Data augmented with transformations such as rotation, scaling, and flipping.
Model: Same architecture as the GCN with SLIC.
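As an illustration of the augmentation step, the sketch below applies a random horizontal flip and a random rotation to an image before it would be segmented into superpixels. It is a simplification under stated assumptions: rotations are restricted to multiples of 90 degrees so that plain NumPy suffices, scaling is left out, and the 50% flip probability is an arbitrary choice rather than a value from the study.

```python
import numpy as np


def augment(image, rng):
    """Randomly flip and rotate a square (H, W, C) image.

    Only horizontal flips and 90-degree rotations are used here;
    arbitrary-angle rotation and scaling would need an image library.
    """
    out = image
    if rng.random() < 0.5:
        out = np.flip(out, axis=1)  # horizontal flip
    k = int(rng.integers(0, 4))  # number of quarter turns
    out = np.rot90(out, k=k, axes=(0, 1))
    return np.ascontiguousarray(out)


rng = np.random.default_rng(0)
image = np.arange(100 * 100 * 3, dtype=np.float32).reshape(100, 100, 3)
augmented = augment(image, rng)
print(augmented.shape)
```

Because the graph is rebuilt from each augmented image, every transformed copy yields a slightly different superpixel graph, which is presumably where the extra training signal comes from.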
The performance of each model was evaluated on the test set, with accuracies as follows:
CNN: 98%
GCN with Delaunay Triangulation: 53%
GCN with SLIC Segmentation: 70%
GCN with SLIC and Data Augmentation: 73%
Confusion matrix for CNN
Confusion matrix for Delaunay Triangulation
Confusion matrix for SLIC
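For reference, a confusion matrix and the corresponding accuracy can be computed from true and predicted labels as sketched below. The label arrays are made-up toy values for illustration, not results from the study; the class order follows the label numbering given earlier.

```python
import numpy as np

CLASSES = ["buildings", "forest", "glacier", "mountain", "sea", "street"]


def confusion_matrix(y_true, y_pred, n_classes=len(CLASSES)):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


# Toy labels: one glacier image is misclassified as mountain.
y_true = np.array([0, 1, 2, 3, 4, 5, 2, 3])
y_pred = np.array([0, 1, 2, 3, 4, 5, 3, 3])

cm = confusion_matrix(y_true, y_pred)
accuracy = np.trace(cm) / cm.sum()  # correct predictions lie on the diagonal
print(cm)
print(f"accuracy = {accuracy:.3f}")
```

Off-diagonal entries show which pairs of classes a model confuses, which is more informative than the overall accuracy alone.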
Strengths:
Limitations:
On the Intel Image Classification dataset, the CNN clearly outperformed every GCN variant (98% test accuracy versus at most 73%). GCNs, while promising, require further optimization and better graph construction techniques to compete with CNNs. SLIC segmentation proved more effective than Delaunay triangulation (70% versus 53%), and data augmentation added a further 3 percentage points (73%).