In this study, we investigate the capabilities of Stable Diffusion for data augmentation in deep learning-based visual inspection of electromagnetic coils. We first construct two base models: a transformer-based model, DINOv2 (self-Distillation with NO labels, version 2) with the ViT-Large architecture (DINOv2-L), and a CNN-based model, EfficientNetV2 Large (EfficientNetV2-L). After developing the base models, we generate images using Stable Diffusion's img2img and Dreambooth's txt2img techniques. These generated images are then added to the training dataset to evaluate whether they improve the F1-score.
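Model quality throughout is compared via the F1-score, the harmonic mean of precision and recall. Below is a minimal, dependency-free sketch for a single binary defect label; the example labels are illustrative, not taken from the coil dataset:

```python
def f1_score(y_true, y_pred):
    """Binary F1 = 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# toy example: 2 TP, 1 FP, 1 FN  ->  F1 = 4 / 6 ≈ 0.667
print(f1_score([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```

In practice the repository's `trainer.py`/`train_and_test.py` compute this on the validation and test splits; the sketch only shows the metric itself.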
```
ai-faps-ekaansh-khosla/
├── Base_model/                                       # all files of the base models
│   ├── labels/                                       # all label files for the base models
│   │   ├── splits/                                   # all splits of train.csv
│   │   │   ├── train_half.csv                        # 50% of train.csv
│   │   │   ├── train_quarter.csv                     # 25% of train.csv
│   │   │   └── train_ten.csv                         # 10% of train.csv
│   │   ├── analysis_of_coils.pdf                     # analysis of coils and how the splits were chosen
│   │   ├── analysis_of_splits.xlsx                   # analysis of all splits (train, validation, test)
│   │   ├── test.csv                                  # test labels
│   │   ├── train.csv                                 # training labels
│   │   └── validation.csv                            # validation labels
│   └── models/                                       # DinoV2_L and EfficientNet_V2_L results
│       ├── DinoV2_L/                                 # all files of DinoV2_L
│       │   ├── 10%_data/                             # Optuna and test results using 10% training data
│       │   ├── 100%_data/                            # Optuna and test results using 100% training data
│       │   ├── 25%_data/                             # Optuna and test results using 25% training data
│       │   ├── 50%_data/                             # Optuna and test results using 50% training data
│       │   ├── Freezing_Layers_experiments/          # DINOv2-L layer-freezing experiments
│       │   └── Visualization_dinoV2_l/               # visualization of the Optuna study
│       └── EfficientNet_V2_L/                        # all files of EfficientNet_V2_L
│           ├── 10%_data/                             # Optuna and test results using 10% training data
│           ├── 100%_data/                            # Optuna and test results using 100% training data
│           ├── 25%_data/                             # Optuna and test results using 25% training data
│           ├── 50%_data/                             # Optuna and test results using 50% training data
│           └── Visualization_efficientNet_l/         # visualization of the Optuna study
│
├── Modular_code/                                     # modular code files for reproducibility
│   ├── Optuna/                                       # all files for the Optuna study
│   │   ├── config.py                                 # defines all training labels and image paths
│   │   ├── data_loader.py                            # loads the data
│   │   ├── main.py                                   # run this file to start the Optuna study
│   │   ├── model_dino.py                             # defines the DINOv2-L model
│   │   ├── model_efficientnet.py                     # defines the EfficientNetV2-L model
│   │   ├── optimization.py                           # defines the Optuna objective function
│   │   └── trainer.py                                # training loop
│   └── Testing/                                      # all files for testing a model
│       ├── config.py                                 # defines all testing labels and image paths
│       ├── data_loader.py                            # loads the data
│       ├── main.py                                   # run this file to test a model
│       ├── model_dino.py                             # defines the DINOv2-L model
│       ├── model_efficientnet.py                     # defines the EfficientNetV2-L model
│       └── train_and_test.py                         # training and testing loop
│
├── stable_diffusion_enhanced_models/                 # all files of models enhanced by augmented images
│   ├── dreambooth_txt2img/                           # all files of the Dreambooth technique
│   │   ├── calculating_FID_values/                   # all files for calculating FID values
│   │   │   ├── calculate FID.ipynb                   # FID calculation code
│   │   │   ├── fid_values_Dreambooth_txt2img.csv     # FID results
│   │   │   └── get_random500_images.ipynb            # filtering images to get one type of defect
│   │   ├── data_transformation_files/                # all files for data transformation for Dreambooth
│   │   │   ├── defects_split/                        # Excel files for each type of defect
│   │   │   ├── get_required_images.ipynb             # code for separating images of each defect
│   │   │   ├── randomly_selecting_100_images.ipynb   # selecting 100 images for Dreambooth training
│   │   │   ├── resize_images-720x468.ipynb           # converts images from 468x468 to 720x468
│   │   │   └── resize_images_468x468.ipynb           # converts images from 720x468 to 468x468
│   │   ├── labels/                                   # all labels including augmented images
│   │   │   └── train_ten_25_text2img_dreambooth.csv  # labels for augmenting 10% to 25% data
│   │   ├── models/                                   # results of models enhanced by Dreambooth
│   │   │   ├── DinoV2_L/10%-25%/                     # DinoV2_L results
│   │   │   └── EfficientNetV2_L/10%-25%/             # EfficientNetV2_L results
│   │   ├── Apply_configuration.ipynb                 # applies the configuration before running Dreambooth
│   │   └── dreambooth_txt2img.py                     # Dreambooth training file
│   └── stable_diffusion_img2img/                     # all files of the Stable Diffusion img2img technique
│       ├── calculating_FID_values/                   # all files for calculating FID values
│       │   ├── calculate FID.ipynb                   # FID calculation code
│       │   ├── fid_values_SD_img2img.csv             # FID results
│       │   └── get_random500_images.ipynb            # filtering images to get one type of defect
│       ├── labels/                                   # all labels including augmented images
│       │   ├── 100_200_augmented.csv                 # labels for augmenting 100% to 200% data
│       │   ├── 10_25_augmented.csv                   # labels for augmenting 10% to 25% data
│       │   ├── 10_50_augmented.csv                   # labels for augmenting 10% to 50% data
│       │   ├── 25_50_augmented.csv                   # labels for augmenting 25% to 50% data
│       │   └── 50_100_augmented.csv                  # labels for augmenting 50% to 100% data
│       ├── models/                                   # results of models enhanced by img2img
│       │   ├── DINOv2/                               # DinoV2_L results
│       │   └── EfficientNetV2/                       # EfficientNetV2_L results
│       └── img2img.py                                # main file changed in CompVis/stable-diffusion
├── README.md                                         # README
├── Results_base_model.png                            # results summary of the base models
├── Results_base_model.xlsx                           # results summary of the base models
├── Results_stable_diffusion_img2img.png              # results summary of stable_diffusion_img2img
├── Results_stable_diffusion_img2img.xlsx             # results summary of stable_diffusion_img2img
├── environment.yml                                   # conda environment file
├── overview.png                                      # overview image
└── requirements.txt                                  # pip requirements file
```
To set up the environment, create the conda environment and install the Python dependencies:

```shell
conda env create -f environment.yml
pip install -r requirements.txt
```
To run an Optuna study, navigate to the `Modular_code/Optuna` directory and update `config.py` with the necessary training labels and image paths, then run:

```shell
python main.py
```
To test a model, navigate to the `Modular_code/Testing` directory and update `config.py` with the testing labels and image paths, then run:

```shell
python main.py
```
To generate augmented images with img2img, clone the Stable Diffusion repository:

```shell
git clone https://github.com/CompVis/stable-diffusion.git
cd stable-diffusion
```

Download the pre-trained checkpoint (`sd-v1-4.ckpt`) from Hugging Face and place it in the root directory of the cloned repository. Then run the img2img script with the following command:

```shell
python scripts/img2img.py --prompt "Image similar to this" --ckpt sd-v1-4.ckpt --skip_grid --n_samples 1 --n_iter 4 --precision autocast --strength 0.05 --ddim_steps 500
```
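A note on the last two flags: in the CompVis img2img script, `--strength` sets the fraction of the `--ddim_steps` diffusion trajectory that is re-run on the input image, so a very low strength such as 0.05 yields outputs that stay close to the source coil photo. A sketch of the arithmetic (paraphrased from how `scripts/img2img.py` derives its encoding step count; variable names here are illustrative):

```python
# number of denoising steps actually applied to the init image:
# roughly t_enc = int(strength * ddim_steps) in the CompVis script
strength = 0.05
ddim_steps = 500
t_enc = int(strength * ddim_steps)
print(f"{t_enc} of {ddim_steps} steps")  # few steps => output stays close to the input
```

Raising `--strength` toward 1.0 would re-run more of the trajectory and produce images that diverge further from the original.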
For the Dreambooth txt2img technique, run the training script:

```shell
python dreambooth.py
```