This document details a project that employs deep learning techniques, specifically Convolutional Neural Networks (CNNs), to accurately classify plant diseases from image data. The implementation compares a custom-built CNN against a transfer learning model using VGG16.
The early and accurate detection of plant diseases is crucial for maintaining crop health and ensuring food security. This project addresses this challenge by leveraging deep learning to automate the identification of 38 different classes of plant diseases from images. By building and evaluating two distinct CNN architectures, the project demonstrates a robust methodology for creating effective classification models.
The project utilizes the "New Plant Diseases Dataset (Augmented)", a comprehensive collection of leaf images covering the 38 plant disease classes, split into training and validation sets.
To prepare the data for the models, the following steps were taken using ImageDataGenerator from TensorFlow/Keras:
- Resizing: all images were resized to 224x224 pixels.
- Rescaling: pixel values were rescaled from the [0, 255] range to [0, 1] to aid in model convergence.
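A minimal sketch of this preprocessing step, assuming a standard `train`/`valid` directory layout and a batch size of 32 (both are placeholders, not stated above):

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values from [0, 255] to [0, 1]; no extra augmentation is
# applied here because the dataset is already augmented.
datagen = ImageDataGenerator(rescale=1.0 / 255)

# Resize every image to 224x224 while streaming batches from disk.
# "train" and "valid" are placeholder directory names.
train_gen = datagen.flow_from_directory(
    "train",
    target_size=(224, 224),
    batch_size=32,
    class_mode="categorical",
)
valid_gen = datagen.flow_from_directory(
    "valid",
    target_size=(224, 224),
    batch_size=32,
    class_mode="categorical",
)
```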
Two distinct models were developed and compared:

Custom CNN: A sequential model was constructed from scratch, featuring the components listed below (a condensed sketch follows the list):
- Conv2D layers with relu activation and L2 regularization (0.001).
- MaxPooling2D layers for down-sampling feature maps.
- BatchNormalization layers to stabilize and accelerate training.
- A Flatten layer followed by Dense layers for classification.
- Dropout (0.5) to reduce overfitting.
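A condensed sketch of such an architecture; the filter counts, number of convolutional blocks, and dense-layer width are illustrative assumptions, while the activation, L2 strength, pooling, batch normalization, and dropout rate follow the description above:

```python
from tensorflow.keras import layers, models, regularizers

NUM_CLASSES = 38  # number of plant disease classes

model = models.Sequential([
    # Convolutional blocks: relu activation, L2 regularization (0.001),
    # max pooling for down-sampling, batch normalization for stability.
    # Filter counts (32, 64) are illustrative placeholders.
    layers.Conv2D(32, (3, 3), activation="relu",
                  kernel_regularizer=regularizers.l2(0.001),
                  input_shape=(224, 224, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.BatchNormalization(),

    layers.Conv2D(64, (3, 3), activation="relu",
                  kernel_regularizer=regularizers.l2(0.001)),
    layers.MaxPooling2D((2, 2)),
    layers.BatchNormalization(),

    # Classification head with dropout (0.5) to reduce overfitting.
    layers.Flatten(),
    layers.Dense(128, activation="relu"),  # width is an assumption
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
```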
VGG16 (Transfer Learning): A more advanced model leveraging the VGG16 architecture, pre-trained on ImageNet (a sketch of this setup and the training configuration follows the list):

- VGG16 was used as a fixed feature extractor (base_model.trainable = False).
- A GlobalAveragePooling2D layer was added, followed by Dense layers with BatchNormalization and Dropout (0.4) to adapt the model to the specific plant disease classes.

Training configuration:

- A learning rate of 0.001 was used.
- categorical_crossentropy was employed as the loss function, suitable for multi-class classification.
- EarlyStopping monitored validation loss and halted training after 10 epochs with no improvement, preventing overfitting.
- ModelCheckpoint saved the best-performing model weights based on validation performance.

The training was configured to utilize available GPUs for accelerated computation.
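A sketch of the transfer-learning setup and training configuration under the assumptions noted in the comments (the Adam optimizer, dense-layer width, checkpoint filename, and epoch count are not specified above):

```python
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import VGG16
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

NUM_CLASSES = 38

# Load VGG16 pre-trained on ImageNet and freeze it as a fixed feature extractor.
base_model = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base_model.trainable = False

model = models.Sequential([
    base_model,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),  # layer width is an assumption
    layers.BatchNormalization(),
    layers.Dropout(0.4),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

# Learning rate 0.001 and categorical cross-entropy as described; the choice
# of Adam as the optimizer is an assumption.
model.compile(
    optimizer=optimizers.Adam(learning_rate=0.001),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

# Stop after 10 epochs without improvement in validation loss, and save the
# best-performing weights seen during training.
callbacks = [
    EarlyStopping(monitor="val_loss", patience=10),
    ModelCheckpoint("best_model.h5", monitor="val_loss", save_best_only=True),
]

# train_gen / valid_gen come from the preprocessing sketch above; the epoch
# count is a placeholder.
model.fit(train_gen, validation_data=valid_gen, epochs=50, callbacks=callbacks)
```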
The performance of both models was evaluated on the validation set using accuracy, precision, and recall. The final metrics from the training history are summarized below:
| Model | Training Accuracy | Validation Accuracy | Training Precision | Validation Precision | Training Recall | Validation Recall |
|---|---|---|---|---|---|---|
| CNN | 95.77% | 94.88% | 96.84% | 95.84% | 94.81% | 94.02% |
| VGG16 (Transfer Learning) | 98.12% | 97.51% | 98.43% | 97.80% | 97.81% | 97.27% |
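For reference, a sketch of how accuracy, precision, and recall can be tracked with Keras built-in metrics; the compile call and metric names here are illustrative, and the table above reports the values from the actual training runs:

```python
from tensorflow.keras.metrics import Precision, Recall

# Compile with precision and recall alongside accuracy so they appear in the
# training history and in the results of evaluate().
model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",
    metrics=["accuracy", Precision(name="precision"), Recall(name="recall")],
)

# Evaluate on the validation generator from the preprocessing sketch.
loss, accuracy, precision, recall = model.evaluate(valid_gen)
print(f"val accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
```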
The results indicate that the VGG16 transfer learning model outperformed the custom CNN across all metrics on both training and validation data, demonstrating the power of leveraging pre-trained weights for computer vision tasks.
This project successfully implements and validates deep learning models for plant disease classification. The superior performance of the VGG16 model highlights the effectiveness of transfer learning.
Future enhancements could include:
This work is licensed under the MIT License, encouraging further development and use by the community.