Dec 27, 2024●11 reads●No License

Developer: @avlike

Development and Evaluation of GANs for Image Restoration

The source code of the project, as well as a detailed description in Ukrainian and instructions for running, training, and testing, can be found in my GitHub repository: https://github.com/andersenbel/rd9_mGAN.

Goal

To develop and fine-tune GANs for image restoration tasks, specifically for:

Super-resolution.
Restoration of damaged images.

Evaluate the model performance using the following metrics:

PSNR (Peak Signal-to-Noise Ratio).
SSIM (Structural Similarity Index).
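For reference, PSNR follows directly from the mean squared error: PSNR = 10 * log10(MAX^2 / MSE), where MAX is the largest possible pixel value. The sketch below is a minimal standard-library illustration on flat lists of pixel values in [0, 1]; the function name `psnr` and the sample values are illustrative and not taken from the repository. In practice, scikit-image (already in the install list) provides `skimage.metrics.peak_signal_noise_ratio` and `skimage.metrics.structural_similarity`; whether evaluate.py calls them is an assumption.

```python
import math


def psnr(reference, restored, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(restored):
        raise ValueError("inputs must contain the same number of pixels")
    # Mean squared error over all pixel values.
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Illustrative values: every pixel is off by 0.1, so MSE is about 0.01.
reference = [0.0, 0.5, 1.0, 0.25]
restored = [0.1, 0.4, 0.9, 0.35]
score = psnr(reference, restored)
print(f"PSNR: {score:.2f} dB")
```

Higher PSNR means a smaller pixel-wise error; SSIM complements it by comparing local structure, with 1.0 indicating identical images.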

Dataset Selection

We selected CIFAR-10 for the following reasons:

Size: CIFAR-10 is small (60,000 images in 10 classes, split into 50,000 training and 10,000 test images), which simplifies training and testing.
Simplicity: CIFAR-10 contains 32x32 images, which are easily upscaled to 128x128 for restoration tasks.
Accessibility: CIFAR-10 can be conveniently downloaded via torchvision, unlike CelebA, which requires additional tools for downloading.
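A rough sketch of how low-resolution/high-resolution pairs could be built from a CIFAR-10 image. It assumes bicubic resizing and pixel values in [0, 1]; the helper `make_training_pair` is hypothetical, and the repository may prepare its data differently. A random tensor stands in for a real image so the snippet runs without downloading anything.

```python
import torch
import torch.nn.functional as F


def make_training_pair(image, low_res=32, high_res=128):
    """Return (low_res_input, high_res_target) for one CHW image tensor in [0, 1]."""
    batch = image.unsqueeze(0)  # F.interpolate expects an NCHW batch
    source = F.interpolate(batch, size=(low_res, low_res),
                           mode="bicubic", align_corners=False)
    target = F.interpolate(batch, size=(high_res, high_res),
                           mode="bicubic", align_corners=False)
    # Bicubic interpolation can overshoot, so clamp back to the valid range.
    return source.squeeze(0).clamp(0, 1), target.squeeze(0).clamp(0, 1)


# Stand-in for one CIFAR-10 image. With torchvision, real images could come from
# torchvision.datasets.CIFAR10(root="./data", train=True, download=True,
#                              transform=torchvision.transforms.ToTensor())
image = torch.rand(3, 32, 32)
low, high = make_training_pair(image)
print(low.shape, high.shape)
```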

GAN Architecture

Generator

The generator takes low-resolution input images (32x32) and restores them to high resolution (128x128).

Features:

Transposed convolutional layers (ConvTranspose2d) for upscaling.
Batch normalization (BatchNorm2d) for training stability.
Activation functions:
- ReLU for hidden layers.
- tanh for the output layer.
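The features above can be sketched as a small PyTorch module. This is a minimal illustration, not the architecture in generator.py: the layer count, channel width, and kernel sizes are assumptions chosen so that two stride-2 transposed convolutions take 32x32 to 128x128.

```python
import torch
from torch import nn


class Generator(nn.Module):
    """Illustrative 4x upscaling generator (layer sizes are assumptions)."""

    def __init__(self, channels=3, width=64):
        super().__init__()
        self.net = nn.Sequential(
            # 32x32 -> 64x64
            nn.ConvTranspose2d(channels, width, kernel_size=4, stride=2, padding=1),
            nn.BatchNorm2d(width),
            nn.ReLU(inplace=True),
            # 64x64 -> 128x128
            nn.ConvTranspose2d(width, width // 2, kernel_size=4, stride=2, padding=1),
            nn.BatchNorm2d(width // 2),
            nn.ReLU(inplace=True),
            # Keep 128x128, map back to image channels.
            nn.ConvTranspose2d(width // 2, channels, kernel_size=3, stride=1, padding=1),
            nn.Tanh(),  # output values lie in [-1, 1]
        )

    def forward(self, x):
        return self.net(x)


low_res = torch.rand(2, 3, 32, 32) * 2 - 1  # random stand-in batch in [-1, 1]
restored = Generator()(low_res)
print(restored.shape)
```

Because tanh bounds the output to [-1, 1], target images would need to be normalized to the same range for the MSE loss to be meaningful.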

Discriminator

The discriminator classifies whether an image is real or generated.

Features:

Convolutional layers (Conv2d) for feature extraction.
Activation functions:
- LeakyReLU for hidden layers.
- sigmoid for classification.
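A matching sketch of the discriminator, again under assumptions: the number of layers, the channel widths, and the pooling plus linear head are illustrative choices and may differ from discriminator.py.

```python
import torch
from torch import nn


class Discriminator(nn.Module):
    """Illustrative real/generated classifier for 128x128 images."""

    def __init__(self, channels=3, width=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, width, kernel_size=4, stride=2, padding=1),   # 128 -> 64
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(width, width * 2, kernel_size=4, stride=2, padding=1),  # 64 -> 32
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(width * 2, width * 4, kernel_size=4, stride=2, padding=1),  # 32 -> 16
            nn.LeakyReLU(0.2, inplace=True),
            nn.AdaptiveAvgPool2d(1),  # collapse the spatial dimensions
            nn.Flatten(),
            nn.Linear(width * 4, 1),
            nn.Sigmoid(),  # probability that the input is a real image
        )

    def forward(self, x):
        return self.net(x)


images = torch.rand(2, 3, 128, 128) * 2 - 1  # random stand-in batch
scores = Discriminator()(images)
print(scores.shape)
```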

Loss Functions and Optimization

Discriminator:
- Loss: Binary Cross-Entropy Loss (BCELoss).
- Optimizer: Adam with a learning rate of 1e-4.
Generator:
- Loss:
  - Mean Squared Error (MSE) for image restoration.
  - Adversarial Loss to deceive the discriminator.
- Optimizer: Adam with a learning rate of 1e-4.
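One training step with these losses might look as follows. This is a sketch under stated assumptions: the tiny stand-in networks exist only to make the snippet self-contained, and the adversarial weight `ADV_WEIGHT` is a hypothetical value, since the text does not say how the two generator losses are combined.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Tiny stand-in networks so the step runs on its own; the real models are larger.
generator = nn.Sequential(
    nn.ConvTranspose2d(3, 8, 4, 2, 1), nn.ReLU(),
    nn.ConvTranspose2d(8, 3, 4, 2, 1), nn.Tanh(),
)
discriminator = nn.Sequential(
    nn.Conv2d(3, 8, 4, 2, 1), nn.LeakyReLU(0.2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1), nn.Sigmoid(),
)

bce = nn.BCELoss()
mse = nn.MSELoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
ADV_WEIGHT = 1e-3  # hypothetical weighting of the adversarial term


def train_step(low_res, high_res):
    real_labels = torch.ones(low_res.size(0), 1)
    fake_labels = torch.zeros(low_res.size(0), 1)

    # Discriminator update: real images -> 1, generated images -> 0.
    opt_d.zero_grad()
    restored = generator(low_res)
    d_loss = (bce(discriminator(high_res), real_labels)
              + bce(discriminator(restored.detach()), fake_labels))
    d_loss.backward()
    opt_d.step()

    # Generator update: pixel-wise MSE plus the adversarial term.
    opt_g.zero_grad()
    g_loss = (mse(restored, high_res)
              + ADV_WEIGHT * bce(discriminator(restored), real_labels))
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()


low = torch.rand(4, 3, 32, 32) * 2 - 1      # random stand-in batch
high = torch.rand(4, 3, 128, 128) * 2 - 1
d_loss, g_loss = train_step(low, high)
print(f"d_loss={d_loss:.4f} g_loss={g_loss:.4f}")
```

Detaching `restored` in the discriminator update keeps that step from pushing gradients into the generator.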

Report: Training and Model Evaluation

Training

The models were trained with the following parameters:

Number of epochs: 50
Batch size: 16
Image resolution: Input images were reduced to 32x32, and output images were scaled to 128x128.
Model saving: Models were saved after each epoch in the format:
- generator_epoch_{number}.pth
- discriminator_epoch_{number}.pth
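Per-epoch saving in this naming scheme can be sketched with `torch.save` on each model's `state_dict`. The helper `save_checkpoints` and the small `nn.Linear` stand-ins are hypothetical, and saving state dicts rather than whole model objects is an assumption about train.py.

```python
import os
import tempfile

import torch
from torch import nn


def save_checkpoints(generator, discriminator, epoch, out_dir):
    """Save both models' weights as {name}_epoch_{epoch}.pth in out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    paths = {}
    for name, model in (("generator", generator), ("discriminator", discriminator)):
        path = os.path.join(out_dir, f"{name}_epoch_{epoch}.pth")
        torch.save(model.state_dict(), path)
        paths[name] = path
    return paths


# Stand-in models; the real ones would be the GAN generator and discriminator.
generator = nn.Linear(4, 2)
discriminator = nn.Linear(4, 1)
out_dir = tempfile.mkdtemp()
paths = save_checkpoints(generator, discriminator, epoch=3, out_dir=out_dir)

# Loading the weights back into a freshly built model of the same shape.
reloaded = nn.Linear(4, 2)
reloaded.load_state_dict(torch.load(paths["generator"], map_location="cpu"))
print(sorted(os.listdir(out_dir)))
```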

Project Files

Key Files:

train.py            # Code for training GAN
evaluate.py         # Code for evaluating models (GAN and SRGAN)
generator.py        # Generator architecture
discriminator.py    # Discriminator architecture
srgan.py            # SRGAN generator architecture (optional)

Comparative Report: Model Quality Assessment

PSNR and SSIM Metrics Table

Epoch	PSNR (dB)	SSIM
1	17.4899	0.7337
2	14.7227	0.8234
3	20.4272	0.8914
4	22.0976	0.9126
5	18.8058	0.7790
6	20.8179	0.9182
7	23.3236	0.9427
8	22.7589	0.9459
9	23.6007	0.9342
10	22.3043	0.9427
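The best epoch for each metric can be read off the table programmatically. A minimal sketch, with the values copied from the table; the `rows` list and the variable names are illustrative.

```python
# (epoch, PSNR in dB, SSIM), copied from the metrics table.
rows = [
    (1, 17.4899, 0.7337),
    (2, 14.7227, 0.8234),
    (3, 20.4272, 0.8914),
    (4, 22.0976, 0.9126),
    (5, 18.8058, 0.7790),
    (6, 20.8179, 0.9182),
    (7, 23.3236, 0.9427),
    (8, 22.7589, 0.9459),
    (9, 23.6007, 0.9342),
    (10, 22.3043, 0.9427),
]

best_psnr = max(rows, key=lambda row: row[1])  # row with the highest PSNR
best_ssim = max(rows, key=lambda row: row[2])  # row with the highest SSIM

print(f"Best PSNR: {best_psnr[1]} dB at epoch {best_psnr[0]}")
print(f"Best SSIM: {best_ssim[2]} at epoch {best_ssim[0]}")
```

The two metrics peak at different epochs, so checkpoint selection depends on which metric is prioritized.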

Visual Comparison of Image Restoration

[Images: restored samples (Image 0 and Image 1) for each of epochs 1 through 4]

Summary

GAN:
- Performs well in restoring overall image structure.
- Has limited detailing, especially for complex textures.
SRGAN:
- Provides better detailing due to its specialized architecture.
- Requires more resources for training.
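Metrics improve with epochs, though not monotonically:
- During early training (epochs 1-3), PSNR and SSIM values are low (PSNR 14.7-20.4 dB, SSIM 0.73-0.89).
- Both metrics dip at epoch 5 (PSNR 18.8058, SSIM 0.7790) before recovering.
- By epochs 7-9, PSNR stays above 22.7 dB and SSIM stays above 0.93.
Optimal quality:
- The highest PSNR (23.6007) was reached at epoch 9, and the highest SSIM (0.9459) at epoch 8.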
Visual comparison:
- Significant progress in detail restoration is visible with each successive epoch.

Results confirm the effectiveness of the model in restoring images from low resolution.

Installation and Execution

Creating and Activating the Environment

./setup_env.sh
source ./_env/bin/activate

Installing Required Libraries

pip install torch torchvision matplotlib gdown scikit-image

Training:

python train.py --dataset_path ./data --epochs 50 --batch_size 16

Evaluating Only GAN:

python train.py --dataset_path ./data --epochs 50 --batch_size 16 --resize 32

Evaluating GAN and SRGAN:

python evaluate.py --model_dir checkpoints --dataset_path ./data --batch_size 16 --max_images 5
