Generative Adversarial Network (GAN)

Last Updated : 29 May, 2026

Generative Adversarial Networks (GANs) are generative models that learn from existing data to produce new, realistic samples. Introduced by Ian Goodfellow and colleagues in 2014, they enable machines to create content such as images, videos and music.

They are useful because:

Architecture of GAN

A GAN consists of two neural networks, the generator and the discriminator, that are trained adversarially: the generator tries to fool the discriminator, while the discriminator tries to distinguish real data from fake data.

1. Generator Model

The generator is a deep neural network that takes random noise as input to generate realistic data samples like images or text. It learns the underlying data patterns by adjusting its internal parameters during training through backpropagation. Its objective is to produce samples that the discriminator classifies as real.

Generator Loss Function: The generator tries to minimize this loss:

J_{G} = -\frac{1}{m} \sum_{i=1}^{m} \log D(G(z_{i}))

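where m is the number of samples in the mini-batch, z_{i} is the i-th random noise vector, G(z_{i}) is the sample the generator produces from it, and D(G(z_{i})) is the discriminator's estimated probability that this generated sample is real.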

The generator aims to maximize D(G(z_{i})), meaning it wants the discriminator to classify its fake data as real (probability close to 1).
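As a quick numeric check of the generator loss, the following sketch evaluates J_G for two made-up batches of discriminator outputs using only Python's standard library. The function name `generator_loss` and the sample probabilities are illustrative assumptions, not part of any library.

```python
import math

def generator_loss(d_fake_probs):
    """J_G = -(1/m) * sum(log D(G(z_i))) over discriminator outputs on fake samples."""
    m = len(d_fake_probs)
    return -sum(math.log(p) for p in d_fake_probs) / m

# Hypothetical discriminator outputs D(G(z_i)) for a batch of 4 fake samples.
fooled = [0.9, 0.8, 0.95, 0.85]   # the discriminator mostly believes the fakes
caught = [0.1, 0.2, 0.05, 0.15]   # the discriminator mostly rejects the fakes

loss_fooled = generator_loss(fooled)
loss_caught = generator_loss(caught)
print(f"loss when fooling D:   {loss_fooled:.4f}")
print(f"loss when caught by D: {loss_caught:.4f}")
```

The loss is small when the discriminator assigns high probabilities to the fakes and grows quickly as those probabilities approach 0, which is what pushes the generator toward more convincing samples.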

2. Discriminator Model

The discriminator is a binary classifier that distinguishes real data from generated samples. Through training, it refines its parameters to improve its detection of fake data. When working with images, it typically uses convolutional layers to extract features and improve classification accuracy.

Discriminator Loss Function: The discriminator tries to minimize this loss:

J_{D} = -\frac{1}{m} \sum_{i=1}^{m} \log D(x_{i}) - \frac{1}{m} \sum_{i=1}^{m} \log(1 - D(G(z_{i})))

The discriminator wants to correctly classify real data as real (maximize \log D(x_{i})) and fake data as fake (maximize \log(1 - D(G(z_{i})))).
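The discriminator loss can be checked the same way. The sketch below is a minimal standard-library version of J_D; the function name `discriminator_loss` and the probabilities are assumptions chosen for illustration.

```python
import math

def discriminator_loss(d_real_probs, d_fake_probs):
    """J_D = -(1/m) sum log D(x_i) - (1/m) sum log(1 - D(G(z_i)))."""
    m = len(d_real_probs)
    if m != len(d_fake_probs):
        raise ValueError("the formula uses the same batch size m for both sums")
    real_term = -sum(math.log(p) for p in d_real_probs) / m
    fake_term = -sum(math.log(1.0 - p) for p in d_fake_probs) / m
    return real_term + fake_term

# Hypothetical outputs: a sharp discriminator versus an unsure one.
sharp_loss = discriminator_loss([0.9, 0.95], [0.1, 0.05])
unsure_loss = discriminator_loss([0.4, 0.5], [0.6, 0.5])
print(f"sharp discriminator:  {sharp_loss:.4f}")
print(f"unsure discriminator: {unsure_loss:.4f}")
```

Implementations often average the real and fake terms instead of summing them. This only rescales the loss by a constant factor and does not change which parameters minimize it.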

MinMax Loss


GANs are trained with a MinMax (minimax) loss, a two-player game between the generator and the discriminator:

\min_{G}\;\max_{D}\; V(G, D) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_{z}(z)}[\log(1 - D(G(z)))]

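where x is a real sample drawn from the data distribution p_{data}, z is a noise vector drawn from the prior p_{z}, G(z) is the sample generated from z, and D(·) is the discriminator's estimated probability that its input is real.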

The generator tries to minimize this loss (to fool the discriminator) and the discriminator tries to maximize it (to detect fakes accurately).
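Two short worked relations may help connect this objective to the per-batch losses defined earlier. The first replaces the expectations with averages over m samples, which gives the negative of the discriminator loss J_D. The second evaluates the objective for a discriminator that outputs 1/2 for every input, meaning it cannot tell real from fake. Note also that J_G as defined earlier is the commonly used non-saturating form, -\log D(G(z)), rather than the literal second term of V.

```latex
% Sample estimate of the value function over a mini-batch of size m
V(G, D) \approx \frac{1}{m} \sum_{i=1}^{m} \log D(x_{i})
              + \frac{1}{m} \sum_{i=1}^{m} \log(1 - D(G(z_{i}))) = -J_{D}

% Reference point: a discriminator that outputs 1/2 everywhere
V(G, D)\big|_{D \equiv 1/2} = \log\tfrac{1}{2} + \log\tfrac{1}{2} = -\log 4 \approx -1.386
```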

Working of GAN

GANs train by having two networks, the Generator (G) and the Discriminator (D), compete and improve together. Here is the step-by-step process:

1. Generator's First Move

The generator starts with a random noise vector, that is, a vector of random numbers. It uses this noise as a starting point to create a fake data sample such as a generated image. The generator’s internal layers transform this noise into something that looks like real data.

2. Discriminator's Turn

The discriminator receives two types of data: real samples from the training dataset and fake samples produced by the generator.

D's job is to analyze each input and decide whether it is real data or something G produced. It outputs a probability score between 0 and 1: a score close to 1 indicates the data is likely real, and a score close to 0 suggests it is fake.
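The mapping from a raw score to a probability can be sketched with a sigmoid, which is also the last layer of the discriminator implemented later in this article. The helper names `sigmoid` and `verdict` and the 0.5 decision threshold are illustrative assumptions.

```python
import math

def sigmoid(logit):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))

def verdict(prob, threshold=0.5):
    """Turn the probability into a hard decision (the threshold is an assumed convention)."""
    return "real" if prob >= threshold else "fake"

# Hypothetical raw discriminator scores before the sigmoid.
for logit in (-3.0, 0.0, 3.0):
    p = sigmoid(logit)
    print(f"logit={logit:+.1f} -> D(x)={p:.3f} -> {verdict(p)}")
```

During training the probability itself, not the hard decision, is fed into the loss, so the networks receive a gradient even when the decision does not flip.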

3. Adversarial Learning

4. Generator's Improvement

5. Discriminator's Adaptation

6. Training Progression

Types of GAN

There are several types of GANs, each designed for a different purpose. Here are some important types:

1. Vanilla GAN

Vanilla GAN is the simplest type of GAN. It consists of:

2. Conditional GAN (CGAN)

Conditional GAN (CGAN) adds a conditional parameter, such as a class label, to guide the generation process. Instead of generating data at random, it allows the model to produce specific types of outputs. Working of CGANs:

Example: Instead of generating any random image, CGAN can generate a specific object like a dog or a cat based on the label.
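One common way to supply the condition is to append a one-hot encoding of the label to the noise vector before it enters the generator. This is an assumption made for illustration, since the article does not specify the mechanism. The sketch below builds such an input with the standard library only; `one_hot`, `conditional_input` and the two-class label set are hypothetical names.

```python
import random

def one_hot(label, num_classes):
    """Return a list with 1.0 at index `label` and 0.0 elsewhere."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def conditional_input(latent_dim, label, num_classes, rng):
    """Build the generator input for one sample: noise followed by the label code."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    return noise + one_hot(label, num_classes)

rng = random.Random(0)        # fixed seed so the sketch is reproducible
CLASSES = ["cat", "dog"]      # hypothetical label set

g_input = conditional_input(latent_dim=4, label=CLASSES.index("dog"),
                            num_classes=len(CLASSES), rng=rng)
print(len(g_input))           # latent_dim + number of classes
print(g_input[-2:])           # the label code sits at the end of the vector
```

In a conditional setup the discriminator usually receives the same label alongside the image, so it can penalize samples that look realistic but do not match the requested class.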

3. Deep Convolutional GAN (DCGAN)

Deep Convolutional GANs (DCGANs) are among the most popular types of GANs used for image generation. They are important because they:

4. Laplacian Pyramid GAN (LAPGAN)

Laplacian Pyramid GAN (LAPGAN) is designed to generate high-quality images by using a multi-resolution, coarse-to-fine approach. Working of LAPGAN:

5. Super-Resolution GAN (SRGAN)

Super-Resolution GAN (SRGAN) is designed to increase the resolution of low-quality images while preserving details. Working of SRGAN:

Implementation

Generative Adversarial Networks (GANs) can generate realistic images by learning from existing image datasets. Here we implement a GAN trained on the CIFAR-10 dataset using PyTorch.

Step 1: Importing Required Libraries

We will use the PyTorch, Torchvision, Matplotlib and NumPy libraries. Set the device to GPU if one is available; otherwise use the CPU.

```python
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
from torchvision import datasets, transforms
import matplotlib.pyplot as plt
import numpy as np

# Use the GPU when available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
```

Step 2: Defining Image Transformations

We use PyTorch’s transforms to convert images to tensors and normalize pixel values between -1 and 1 for better training stability.

```python
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])
```
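The arithmetic behind this normalization is simple enough to verify by hand. `ToTensor()` scales 8-bit pixel values into [0, 1], and `Normalize` with mean 0.5 and standard deviation 0.5 then computes `(pixel - 0.5) / 0.5` per channel. The sketch below reproduces that mapping and its inverse in plain Python; the helper names `normalize` and `denormalize` are illustrative, not torchvision functions.

```python
def normalize(pixel, mean=0.5, std=0.5):
    """Same per-channel arithmetic as Normalize: (pixel - mean) / std."""
    return (pixel - mean) / std

def denormalize(value, mean=0.5, std=0.5):
    """Inverse mapping, useful before displaying generated images."""
    return value * std + mean

# Pixel intensities after ToTensor(): 0.0 is black, 1.0 is full intensity
for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
```

Mapping the data to [-1, 1] matches the range of a Tanh output layer, so real and generated images reach the discriminator on the same scale.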

Step 3: Loading the CIFAR-10 Dataset

Download and load the CIFAR-10 dataset with defined transformations. Use a DataLoader to process the dataset in mini-batches of size 32 and shuffle the data.

```python
train_dataset = datasets.CIFAR10(root='./data', train=True,
                                 download=True, transform=transform)
dataloader = torch.utils.data.DataLoader(train_dataset,
                                         batch_size=32, shuffle=True)
```
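It helps to know how many mini-batches one epoch contains. The CIFAR-10 training split has 50,000 images, and a `DataLoader` keeps the final partial batch unless `drop_last=True` is passed. The short calculation below, which uses hypothetical constant names, shows the resulting batch count and the size of the last batch.

```python
import math

DATASET_SIZE = 50_000   # number of images in the CIFAR-10 training split
BATCH_SIZE = 32         # same value as batch_size in the DataLoader

# The last, smaller batch is kept by default, so round up
num_batches = math.ceil(DATASET_SIZE / BATCH_SIZE)
last_batch_size = DATASET_SIZE - (num_batches - 1) * BATCH_SIZE

print("batches per epoch:", num_batches)
print("size of last batch:", last_batch_size)
```

Because the last batch is smaller than the others, label tensors in the training loop should be sized from the actual batch (for example with `real_images.size(0)`) rather than from a fixed batch size.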

Step 4: Defining GAN Hyperparameters

Set important training parameters:

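```python
latent_dim = 100   # dimensionality of the random noise vector
lr = 0.0002        # learning rate for both Adam optimizers
beta1 = 0.5        # Adam coefficient for the first-moment estimate
beta2 = 0.999      # Adam coefficient for the second-moment estimate
num_epochs = 10    # number of passes over the training set
```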

Step 5: Building the Generator

Create a neural network that converts random noise into images. Use upsampling followed by convolutional layers, batch normalization and ReLU activations. The final layer uses Tanh activation to scale outputs to the range [-1, 1].

```python
class Generator(nn.Module):
    def __init__(self, latent_dim):
        super(Generator, self).__init__()

        self.model = nn.Sequential(
            # Project the noise vector to a 128-channel 8x8 feature map
            nn.Linear(latent_dim, 128 * 8 * 8),
            nn.ReLU(),
            nn.Unflatten(1, (128, 8, 8)),
            # 8x8 -> 16x16
            nn.Upsample(scale_factor=2),
            nn.Conv2d(128, 128, kernel_size=3, padding=1),
            nn.BatchNorm2d(128, momentum=0.78),
            nn.ReLU(),
            # 16x16 -> 32x32
            nn.Upsample(scale_factor=2),
            nn.Conv2d(128, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64, momentum=0.78),
            nn.ReLU(),
            # Map to 3 colour channels with values in [-1, 1]
            nn.Conv2d(64, 3, kernel_size=3, padding=1),
            nn.Tanh()
        )

    def forward(self, z):
        img = self.model(z)
        return img
```
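To see how the generator's layers turn a noise vector into a 32×32 image, the following sketch traces the tensor shape through the network with plain Python arithmetic, so it runs without PyTorch. It assumes that `Upsample(scale_factor=2)` doubles height and width and that a 3×3 convolution with `padding=1` leaves them unchanged; the variable names are illustrative.

```python
LATENT_DIM = 100
channels, height, width = 128, 8, 8

# nn.Linear maps the noise vector to this many features before Unflatten
linear_out = channels * height * width
shapes = [("linear + unflatten", (channels, height, width))]

# Each stage: (description, output channels, upsampling factor applied first)
stages = [
    ("upsample x2 + conv 128->128", 128, 2),
    ("upsample x2 + conv 128->64", 64, 2),
    ("conv 64->3 + tanh", 3, 1),
]
for name, out_channels, scale in stages:
    # The 3x3 convolutions use padding=1, so only upsampling changes H and W
    height, width = height * scale, width * scale
    channels = out_channels
    shapes.append((name, (channels, height, width)))

print(f"noise ({LATENT_DIM},) -> linear output ({linear_out},)")
for name, shape in shapes:
    print(f"{name:30s} {shape}")
```

The final shape, 3 channels at 32×32, matches the size of a CIFAR-10 image, which is what lets the discriminator treat real and generated inputs identically.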

Step 6: Building the Discriminator

Create a binary classifier network that distinguishes real from fake images. Use convolutional layers, batch normalization, dropout, LeakyReLU activation and a Sigmoid output layer to give a probability between 0 and 1.

```python
class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()

        self.model = nn.Sequential(
            # 32x32 -> 16x16
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.25),
            # 16x16 -> 8x8, then zero-padded to 9x9
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
            nn.ZeroPad2d((0, 1, 0, 1)),
            nn.BatchNorm2d(64, momentum=0.82),
            nn.LeakyReLU(0.25),
            nn.Dropout(0.25),
            # 9x9 -> 5x5
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(128, momentum=0.82),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.25),
            # 5x5 -> 5x5
            nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(256, momentum=0.8),
            nn.LeakyReLU(0.25),
            nn.Dropout(0.25),
            # Flatten and map to a single probability
            nn.Flatten(),
            nn.Linear(256 * 5 * 5, 1),
            nn.Sigmoid()
        )

    def forward(self, img):
        validity = self.model(img)
        return validity
```
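The input size of the final linear layer, `256 * 5 * 5`, follows from the standard convolution output-size formula, floor((n + 2p - k) / s) + 1. The sketch below applies it layer by layer to a 32×32 input using plain Python; the helper name `conv_out` is an assumption for illustration.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Output side length of a square convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

trace = [32]                                  # CIFAR-10 images are 32x32
trace.append(conv_out(trace[-1], stride=2))   # Conv2d(3, 32, stride=2)
trace.append(conv_out(trace[-1], stride=2))   # Conv2d(32, 64, stride=2)
trace.append(trace[-1] + 1)                   # ZeroPad2d((0, 1, 0, 1)): +1 column, +1 row
trace.append(conv_out(trace[-1], stride=2))   # Conv2d(64, 128, stride=2)
trace.append(conv_out(trace[-1], stride=1))   # Conv2d(128, 256, stride=1)

# Number of features entering the final nn.Linear after Flatten
flat_features = 256 * trace[-1] * trace[-1]
print("side lengths:", trace)
print("flattened features:", flat_features)
```

If the input resolution or any stride changes, rerunning a trace like this is a quick way to find the new input size for the linear layer.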

Step 7: Initializing GAN Components

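```python
# Instantiate both networks on the selected device
generator = Generator(latent_dim).to(device)
discriminator = Discriminator().to(device)

# Binary cross-entropy pairs with the discriminator's Sigmoid output
adversarial_loss = nn.BCELoss()

# One Adam optimizer per network
optimizer_G = optim.Adam(generator.parameters(),
                         lr=lr, betas=(beta1, beta2))
optimizer_D = optim.Adam(discriminator.parameters(),
                         lr=lr, betas=(beta1, beta2))
```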

Step 8: Training the GAN

Train the discriminator on real and fake images, then update the generator to improve its fake image quality. Print the losses every 100 batches and visualize generated images every 10 epochs.

```python
for epoch in range(num_epochs):
    for i, batch in enumerate(dataloader):
        # Move the real images to the training device
        real_images = batch[0].to(device)
        batch_size = real_images.size(0)

        # Target labels: 1 for real images, 0 for generated ones
        valid = torch.ones(batch_size, 1, device=device)
        fake = torch.zeros(batch_size, 1, device=device)

        # ---- Train the discriminator ----
        optimizer_D.zero_grad()

        z = torch.randn(batch_size, latent_dim, device=device)
        fake_images = generator(z)

        real_loss = adversarial_loss(discriminator(real_images), valid)
        # detach() keeps this step from updating the generator
        fake_loss = adversarial_loss(discriminator(fake_images.detach()), fake)
        d_loss = (real_loss + fake_loss) / 2

        d_loss.backward()
        optimizer_D.step()

        # ---- Train the generator ----
        optimizer_G.zero_grad()

        gen_images = generator(z)
        # The generator improves when its images are labelled as real
        g_loss = adversarial_loss(discriminator(gen_images), valid)
        g_loss.backward()
        optimizer_G.step()

        if (i + 1) % 100 == 0:
            print(
                f"Epoch [{epoch+1}/{num_epochs}] "
                f"Batch {i+1}/{len(dataloader)} "
                f"Discriminator Loss: {d_loss.item():.4f} "
                f"Generator Loss: {g_loss.item():.4f}"
            )

    # Show a 4x4 grid of generated samples every 10 epochs
    if (epoch + 1) % 10 == 0:
        with torch.no_grad():
            z = torch.randn(16, latent_dim, device=device)
            generated = generator(z).cpu()
            grid = torchvision.utils.make_grid(generated, nrow=4, normalize=True)
            plt.imshow(np.transpose(grid.numpy(), (1, 2, 0)))
            plt.axis("off")
            plt.show()
```

Output:

[Figure: Training]

[Figure: Output]

By following these steps, we implemented and trained a GAN that learns to generate CIFAR-10-style images through adversarial training.


Applications

Advantages

Limitations