Explain DCGAN in detail?

board-infinity · 15 October 2022 10:39

DCGAN uses convolutional layers in the discriminator and convolutional-transpose layers in the generator. It was proposed by Radford et al. in the paper Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. The discriminator consists of strided convolution layers, batch normalization layers, and LeakyReLU activations. It takes a 3x64x64 input image and outputs the probability that the image is real. The generator consists of convolutional-transpose layers, batch normalization layers, and ReLU activations. It takes a latent vector as input and outputs a 3x64x64 RGB image.
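
The 64x64 figures follow from simple size arithmetic. As a rough sketch, assuming the layer settings commonly used for DCGAN at this resolution (4x4 kernels, stride 2, padding 1, plus one stride-1, padding-0 layer at the latent end; none of these values are stated above), the illustrative helpers below trace how the spatial size changes through each network:

```python
def conv_out(size, kernel=4, stride=2, padding=1):
    """Output size of a strided convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size, kernel=4, stride=2, padding=1):
    """Output size of a transposed convolution (no dilation or output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


# Generator: the latent vector is treated as a 1x1 "image" and upsampled.
gen_sizes = [1]
gen_sizes.append(conv_transpose_out(gen_sizes[-1], stride=1, padding=0))
for _ in range(4):
    gen_sizes.append(conv_transpose_out(gen_sizes[-1]))  # doubles each time

# Discriminator: a 64x64 image is downsampled to a single score.
disc_sizes = [64]
for _ in range(4):
    disc_sizes.append(conv_out(disc_sizes[-1]))  # halves each time
disc_sizes.append(conv_out(disc_sizes[-1], stride=1, padding=0))

print(gen_sizes)   # [1, 4, 8, 16, 32, 64]
print(disc_sizes)  # [64, 32, 16, 8, 4, 1]
```

Under these assumptions the two networks mirror each other: the generator doubles the spatial size at each stride-2 layer, and the discriminator halves it.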

First, we will load the dataset from Google Drive.

from google.colab import drive
drive.mount("/content/drive")

The above code mounts your Google Drive at /content/drive in Colab, so the dataset is available at the path where you stored it. Next, extract the zip file into a new folder named dataset:

# First rename the file in Drive from "Copy of img_align_celeba.zip" to "img_align_celeba.zip"
!unzip /content/drive/MyDrive/img_align_celeba.zip -d "/content/dataset"
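
Outside Colab, where the `!unzip` shell command may not be available, the same extraction can be done with Python's standard `zipfile` module. This is only a minimal sketch; the `extract_dataset` helper name is made up for illustration, and the paths in the usage comment are the ones from the command above.

```python
import os
import zipfile


def extract_dataset(zip_path, target_dir):
    """Extract zip_path into target_dir, creating the folder if needed."""
    os.makedirs(target_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
    # Return the top-level entries so the resulting layout can be checked.
    return sorted(os.listdir(target_dir))


# Usage with the paths from this walkthrough:
# extract_dataset("/content/drive/MyDrive/img_align_celeba.zip", "/content/dataset")
```

The returned list makes it easy to see whether the archive unpacked into a subfolder or dropped the images directly into the target folder.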

Let’s import the required modules

from __future__ import print_function
import argparse
import os
import random
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.backends.cudnn as cudnn
import torch.optim as optim
import torch.utils.data
import torchvision.datasets as dset
import torchvision.transforms as transforms
import torchvision.utils as vutils
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from IPython.display import HTML

Let’s define our inputs:

dataroot – the path to the root of the dataset folder.
workers – the number of worker processes the DataLoader uses to load the data.
batch_size – the batch size used in training.
image_size – the spatial size of the images used for training; every image is resized to this size.
nc – the number of color channels in the input images (3 for RGB).
nz – the length of the latent vector fed to the generator.
ngf – sets the depth of feature maps carried through the generator.
ndf – sets the depth of feature maps propagated through the discriminator.
num_epochs – the number of training epochs to run.
lr – the learning rate for training.
beta1 – the beta1 hyperparameter for the Adam optimizers.
ngpu – the number of GPUs available; use 0 to run on the CPU.

dataroot = "/content/dataset"
workers = 2
batch_size = 128
image_size = 64
nc = 3
nz = 100
ngf = 64
ndf = 64
num_epochs = 5
lr = 0.0002
beta1 = 0.5
ngpu = 1
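
To make `ngf` and `ndf` more concrete: in the usual DCGAN layout for 64x64 images (an assumption here, since the layer definitions are not shown above), the generator starts at `ngf * 8` channels and halves the count at each layer, while the discriminator starts at `ndf` and doubles it. A small sketch of that arithmetic, using the values just defined:

```python
nc, nz, ngf, ndf = 3, 100, 64, 64  # values from the settings above

# Assumed layout: four hidden layers per network, widths scaled by ngf / ndf.
gen_channels = [nz] + [ngf * m for m in (8, 4, 2, 1)] + [nc]
disc_channels = [nc] + [ndf * m for m in (1, 2, 4, 8)] + [1]

print(gen_channels)   # [100, 512, 256, 128, 64, 3]
print(disc_channels)  # [3, 64, 128, 256, 512, 1]
```

Each adjacent pair in these lists gives the input and output channel count of one layer, so raising `ngf` or `ndf` widens every hidden layer of the corresponding network.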

Here we use the ImageFolder dataset class, which requires the images to sit in a subdirectory of the dataset’s root folder. We can then create the dataset and the data loader, and visualize some of the training data.
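
Since `ImageFolder` treats each subdirectory of the root as one class, it can help to confirm the layout before building the dataset. The check below is a sketch using only the standard library. `list_class_dirs` is a hypothetical helper, not part of torchvision, and it is rougher than what `ImageFolder` does itself (it does not filter by file extension or look into nested folders).

```python
import os


def list_class_dirs(root):
    """Return the names of subdirectories of root that contain at least one file."""
    found = []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if os.path.isdir(path) and any(
            os.path.isfile(os.path.join(path, f)) for f in os.listdir(path)
        ):
            found.append(name)
    return found


# Usage: an empty result suggests the images sit directly in the root
# and need to be moved into a subfolder first.
# list_class_dirs("/content/dataset")
```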

dataset = dset.ImageFolder(root=dataroot,
                           transform=transforms.Compose([
                               transforms.Resize(image_size),
                               transforms.CenterCrop(image_size),
                               transforms.ToTensor(),
                               transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
                           ]))
# Create the dataloader
dataloader = torch.utils.data.DataLoader(dataset, batch_size=batch_size,
                                         shuffle=True, num_workers=workers)
# Decide which device we want to run on
device = torch.device("cuda:0" if (torch.cuda.is_available() and ngpu > 0) else "cpu")
# Plot some training images
real_batch = next(iter(dataloader))
plt.figure(figsize=(8,8))
plt.axis("off")
plt.title("Training Images")
plt.imshow(np.transpose(vutils.make_grid(real_batch[0