Image classification is one of the fundamental tasks in computer vision: assigning labels to images based on their visual content. This project classifies natural scenes using a deep learning model, a Convolutional Neural Network (CNN). The model is built with TensorFlow and uses the pretrained ResNet50 architecture for transfer learning.
Project Objectives:
The model classifies scene images from around the world into predefined categories based on their visual content.
Tech Stack
Framework: TensorFlow with Keras API
Primary Model: ResNet50 (pretrained on ImageNet)
Development Environment: Google Colab
Language: Python 3
Dataset
This project uses the Intel Images dataset, a collection of scene images from around the world. It includes categories such as mountains, forests, and beaches.
Project Steps:
Data Preprocessing Pipeline
Model Architecture
Using TensorFlow and its Keras API together with the ResNet50 architecture, we built a Convolutional Neural Network (CNN) through transfer learning. The CNN is made up of these key building blocks:
Key Components
Base Model: ResNet50 pretrained on ImageNet
Additional Layers: Global Average Pooling and a Dense layer with softmax activation
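The components listed above could be assembled roughly as follows. This is a minimal sketch, not the project's actual code: the function name build_model, the class count of 6, and the choice to freeze the ResNet50 backbone are all assumptions, since the README states neither the number of categories nor which layers were trainable.

```python
import tensorflow as tf

NUM_CLASSES = 6          # assumption: the README does not state the number of categories
IMAGE_SIZE = (224, 224)  # image size given in the README


def build_model(num_classes=NUM_CLASSES, weights="imagenet", freeze_base=True):
    """ResNet50 backbone followed by global average pooling and a softmax classifier."""
    base = tf.keras.applications.ResNet50(
        include_top=False,            # drop the original 1000-class ImageNet head
        weights=weights,              # "imagenet" downloads the pretrained weights
        input_shape=IMAGE_SIZE + (3,),
    )
    # Assumption: keep the pretrained backbone fixed and train only the new layers.
    base.trainable = not freeze_base

    # Inputs are expected to be already preprocessed for ResNet50.
    return tf.keras.Sequential([
        tf.keras.Input(shape=IMAGE_SIZE + (3,)),
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
```

Calling build_model() returns a model whose output is one probability per class; passing weights=None skips the weight download, which is convenient for a quick shape check.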
Training Configuration and Process:
Batch size: 32
Learning rate: 0.01
Loss: Categorical cross-entropy
Optimizer: Adam
Metrics: Accuracy
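In Keras, the configuration above maps onto a compile call followed by fit. The sketch below is illustrative only: compile_and_train is a hypothetical helper name, and the default of 5 epochs is taken from the Results section. It assumes the dataset yields batches of (images, one-hot labels), which is what categorical cross-entropy expects.

```python
import tensorflow as tf

LEARNING_RATE = 0.01  # from the configuration above
EPOCHS = 5            # from the Results section


def compile_and_train(model, train_ds, epochs=EPOCHS, learning_rate=LEARNING_RATE):
    """Compile with Adam, categorical cross-entropy and accuracy, then train."""
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="categorical_crossentropy",  # labels must be one-hot encoded
        metrics=["accuracy"],
    )
    # train_ds is expected to be an already batched tf.data.Dataset.
    return model.fit(train_ds, epochs=epochs)
```

The returned History object holds the per-epoch loss and accuracy in history.history.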
In this project, we applied transfer learning: ResNet50 is combined with additional layers, connected sequentially, to build the model. The sequential model takes an image as input, and the image volume passes through the layers in order.
ResNet50 is known for its depth and residual connections. We initialized the model with ResNet50 weights pretrained on the ImageNet dataset. The model processes 32 images at a time, and each image in the batch is 224x224 pixels. The final layers were fine-tuned specifically for natural scene classification, and the last layer uses the softmax activation function to output a probability for each class.
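One way to produce batches of 32 images at 224x224 is a tf.data input pipeline. The sketch below assumes the images are stored in one subfolder per class, which the README does not confirm, and load_split is a hypothetical helper name. It also applies the ResNet50 preprocessing function that ships with Keras; whether the original project did the same is not stated.

```python
import tensorflow as tf

BATCH_SIZE = 32          # from the README
IMAGE_SIZE = (224, 224)  # from the README


def load_split(directory, shuffle=True):
    """Load images from class subfolders as batches of (images, one-hot labels)."""
    ds = tf.keras.utils.image_dataset_from_directory(
        directory,
        label_mode="categorical",  # one-hot labels for categorical cross-entropy
        image_size=IMAGE_SIZE,     # every image is resized to 224x224
        batch_size=BATCH_SIZE,
        shuffle=shuffle,
    )
    # Convert pixels to the format the ImageNet-pretrained ResNet50 weights expect.
    preprocess = tf.keras.applications.resnet50.preprocess_input
    ds = ds.map(
        lambda images, labels: (preprocess(images), labels),
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    return ds.prefetch(tf.data.AUTOTUNE)
```

For a test split, shuffle=False keeps the file order stable, which makes it easier to line predictions up with file names.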
Results
The model was trained for 5 epochs and reached an accuracy of 0.96 on the training set.
The model also performed well on the test set, with an accuracy of 0.9170 and a loss of 0.3627.
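Test-set figures like these are typically obtained with Keras's evaluate method. The following is a small sketch under that assumption; evaluate_model is a hypothetical helper, and it expects a model compiled with a single accuracy metric so that evaluate returns exactly two values.

```python
import tensorflow as tf


def evaluate_model(model: tf.keras.Model, test_ds: tf.data.Dataset):
    """Return (loss, accuracy) for a model compiled with metrics=["accuracy"]."""
    # With one compiled metric, evaluate returns [loss, accuracy].
    loss, accuracy = model.evaluate(test_ds, verbose=0)
    return float(loss), float(accuracy)
```

A call such as loss, accuracy = evaluate_model(model, test_ds) can then be formatted for reporting.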
Conclusion
This project demonstrates the effectiveness of transfer learning with the ResNet50 model for image classification, reaching an accuracy of 0.9170 on the test set.
Future Work/Improvement
Experiment with different architectures (EfficientNet, VGG16)
Implement cross-validation
Add model interpretability using techniques like Grad-CAM
Optimize hyperparameters