Undertook a comprehensive exploration of fake and real video datasets, employing advanced techniques in face detection, data preprocessing, and the creation of structured training, validation, and testing sets. This project served as the culmination of my Master's degree in Ottawa in 2023.
The deepfake detection system utilizes a multi-stage approach involving data preprocessing, feature extraction, deep learning-based classification, and a user-friendly web interface. It employs state-of-the-art algorithms to distinguish between authentic and manipulated videos, addressing the challenge of deepfake proliferation.
Data Collection
Data Exploration
Number of Fake Videos: 1000
Number of Real Videos: 1000
Video Processing
| Sampling rate | Total videos | Total frames | Average frames per video |
|---|---|---|---|
| 1 frame every 1 second | 1999 | 16370 | 8.19 |
| 1 frame every 2 seconds | 1999 | 7965 | 3.98 |
| 1 frame every 4 seconds | 1999 | 3258 | 1.63 |
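Frame sampling at a fixed interval is typically done with OpenCV; the sketch below only illustrates the idea (the function name `extract_frames` and the fps fallback are assumptions, not the project's actual code).

```python
import cv2

def extract_frames(video_path, interval_sec=2):
    """Sample one frame every `interval_sec` seconds from a video (illustrative sketch)."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30   # fall back to 30 fps if metadata is missing
    step = int(round(fps * interval_sec))   # frames to skip between samples
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frames.append(frame)
        idx += 1
    cap.release()
    return frames
```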
Each extracted frame is stored with its `Video ID`, `Frame ID`, and `Video Label`. Face detection is performed with `cvlib`, resizing images to 300x300 and drawing rectangles around the detected faces.
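A minimal sketch of this step, assuming `cvlib` for detection and OpenCV for drawing and resizing; the function name `detect_and_mark_faces` is illustrative and not taken from the original code.

```python
import cv2
import cvlib as cv

def detect_and_mark_faces(frame, size=(300, 300)):
    """Detect faces with cvlib, draw bounding rectangles, and resize to 300x300 (illustrative)."""
    faces, confidences = cv.detect_face(frame)  # each face is an [x1, y1, x2, y2] box
    for (x1, y1, x2, y2) in faces:
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
    return cv2.resize(frame, size)
```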
Data Preprocessing
Labels are encoded with scikit-learn's `LabelEncoder`.
Data Preparation
The normalized `Frame` data and `Labels` columns are converted to TensorFlow tensors.
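A minimal sketch of this preparation step, assuming the frames and labels live in a pandas DataFrame with `Frame` and `Labels` columns (the DataFrame name `df` and the label ordering are assumptions for illustration):

```python
import numpy as np
import tensorflow as tf
from sklearn.preprocessing import LabelEncoder

# Encode REAL/FAKE string labels as integers, then one-hot encode for the softmax head.
encoder = LabelEncoder()
labels = encoder.fit_transform(df['Labels'])           # e.g. FAKE -> 0, REAL -> 1
labels = tf.keras.utils.to_categorical(labels, 2)

# Stack the 300x300 face crops and scale pixel values to [0, 1].
frames = np.stack(df['Frame'].values).astype('float32') / 255.0

X = tf.convert_to_tensor(frames)
y = tf.convert_to_tensor(labels)
```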
Model Creation and Training
ResNet50, InceptionResNetV2, MobileNetV2, and VGG16 models pre-trained on ImageNet.
Transfer learning with specific architectures (custom layers):
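The custom head below attaches to a pre-trained backbone; how the backbone is loaded is not shown in the original, but a typical setup (assumed here, including freezing the base) is:

```python
from tensorflow.keras.applications import ResNet50

# ImageNet-pretrained backbone without its classification head,
# sized for the 300x300 face crops (freezing the base is an assumption).
resnet_model = ResNet50(weights='imagenet', include_top=False, input_shape=(300, 300, 3))
resnet_model.trainable = False
```

The custom classification layers are then added on top of this backbone: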
```python
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense, Dropout
from tensorflow.keras.models import Model
x = GlobalAveragePooling2D()(resnet_model.output)  # pool the backbone's feature maps
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
x = Dense(2, activation='softmax')(x)              # two classes: REAL vs FAKE
model = Model(inputs=resnet_model.input, outputs=x)
```
Model compilation:
```python
from tensorflow.keras.optimizers import Adam

custom_optimizer = Adam(learning_rate=0.0001)
model.compile(optimizer=custom_optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
```
```python
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.optimizers.schedules import ExponentialDecay

lr_schedule = ExponentialDecay(initial_learning_rate=0.0001,  # initial rate (value assumed)
                               decay_steps=100000, decay_rate=0.96, staircase=True)
optimizer = Adam(learning_rate=lr_schedule)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
```
```python
from tensorflow.keras.optimizers import SGD

sgd = SGD(learning_rate=0.0001)  # Stochastic Gradient Descent optimizer with a specific learning rate
vgg_model_transfer.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])  # Compile the model
```
Training details: epochs, batch size, early stopping.
```python
from tensorflow.keras.callbacks import EarlyStopping

epochs = 100
batch_size = 32
learning_rate = 0.00001
early_stopping = EarlyStopping(monitor='val_loss', patience=7, restore_best_weights=True)
```
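For context, these settings would typically be passed to `model.fit`; the training/validation tensor names below (`X_train`, `y_train`, `X_val`, `y_val`) are assumptions, as this section does not restate the split.

```python
history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),   # held-out validation set (names assumed)
    epochs=epochs,
    batch_size=batch_size,
    callbacks=[early_stopping],
)
```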
Evaluation and Result Analysis
Confusion matrix for video-label determination: computed by applying a threshold to the per-frame predictions to decide each video's overall label (REAL or FAKE); see the sketch after the list below.
- Prediction Threshold
- Categorization of Videos
- Comparison with Actual Labels
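A minimal sketch of this aggregation, assuming frame-level softmax outputs per video and an illustrative threshold of 0.5 on the fraction of frames classified as FAKE (the helper name, class ordering, and threshold value are assumptions):

```python
import numpy as np

def video_label_from_frames(frame_probs, threshold=0.5):
    """Aggregate per-frame [p_fake, p_real] predictions into a single video label (illustrative)."""
    frame_labels = np.argmax(frame_probs, axis=1)   # 0 = FAKE, 1 = REAL (ordering assumed)
    fake_fraction = np.mean(frame_labels == 0)       # share of frames predicted FAKE
    return 'FAKE' if fake_fraction >= threshold else 'REAL'
```

The resulting video-level predictions can then be compared against the ground-truth labels, e.g. with `sklearn.metrics.confusion_matrix`.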
Cross-Validation
Soft Voting
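Soft voting here presumably averages the predicted class probabilities of the individual models; a minimal sketch under that assumption (the model and variable names are illustrative, not the project's actual identifiers):

```python
import numpy as np

# Average the softmax outputs of the individual backbones (soft voting).
probs = [m.predict(X_test) for m in (resnet, inception_resnet, mobilenet, vgg)]
avg_probs = np.mean(probs, axis=0)
ensemble_pred = np.argmax(avg_probs, axis=1)
```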
Hyperparameter Tuning: performed on the champion model, MobileNetV2.
Different learning rates: with batch size 32 and early-stopping patience of 5 epochs.
Different batch sizes: with learning rate 10^(-4) and early-stopping patience of 5 epochs.
Different early-stopping patience values (number of epochs): with learning rate 10^(-4) and batch size 32.
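A minimal sketch of one such sweep (over learning rates, with batch size 32 and patience 5), assuming a `build_mobilenet_model` factory that rebuilds the MobileNetV2-based classifier for each trial; the factory name, candidate rates, and data tensor names are assumptions:

```python
from tensorflow.keras.callbacks import EarlyStopping

results = {}
for lr in (1e-3, 1e-4, 1e-5):                        # candidate learning rates (illustrative)
    model = build_mobilenet_model(learning_rate=lr)  # assumed factory for the MobileNetV2 classifier
    es = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
    model.fit(X_train, y_train, validation_data=(X_val, y_val),
              epochs=100, batch_size=32, callbacks=[es], verbose=0)
    results[lr] = model.evaluate(X_val, y_val, verbose=0)[1]  # validation accuracy
```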
Overall Comparison and The Superior Model
- Save the superior model for further development.
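Saving in Keras can be done with `model.save`; the variable name `best_model` and the file name below are illustrative, not the project's actual artifact path.

```python
# Persist the best-performing model for later fine-tuning or deployment.
best_model.save('mobilenetv2_deepfake_detector.h5')

# It can be reloaded later with:
# from tensorflow.keras.models import load_model
# best_model = load_model('mobilenetv2_deepfake_detector.h5')
```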