Super Resolution GAN (SRGAN)

Last Updated : 16 May, 2026

Super-Resolution Generative Adversarial Networks (SRGANs) upscale images by converting low-resolution inputs into sharper, more realistic high-resolution outputs while preserving important textures and details.

Architecture Overview

SRGAN follows the GAN framework using two neural networks, a generator and a discriminator. The generator converts low-resolution images into super-resolution images, while the discriminator distinguishes between real high-resolution images and generated images.

Figure: SRGAN architecture

Generator Architecture

The SRGAN generator uses a Residual Network (ResNet) architecture to generate high-resolution images effectively. Residual connections help improve gradient flow and support deeper network training.
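The residual connection described above can be illustrated with a minimal sketch. It is not the SRGAN generator itself: the branch functions here (`zero_branch`, `scale_branch`) are hypothetical stand-ins for the convolutional layers inside a real residual block, and plain Python lists stand in for feature tensors. The point is only that a block outputs its input plus a learned correction, so an input can pass through many stacked blocks unchanged when the branch contributes nothing.

```python
from typing import Callable, List

Vector = List[float]


def residual_block(x: Vector, branch: Callable[[Vector], Vector]) -> Vector:
    """Return x + branch(x); the skip connection adds the input back."""
    fx = branch(x)
    if len(fx) != len(x):
        raise ValueError("branch must preserve the input shape")
    return [xi + fi for xi, fi in zip(x, fx)]


def zero_branch(x: Vector) -> Vector:
    # Hypothetical stand-in for a branch whose weights output zero.
    return [0.0 for _ in x]


def scale_branch(x: Vector) -> Vector:
    # Hypothetical stand-in for a branch that learns a small correction.
    return [0.1 * xi for xi in x]


features = [1.0, -2.0, 3.0]

# With a zero branch, the block reduces to the identity mapping.
identity_out = residual_block(features, zero_branch)

# Stacking three blocks: each one adds a correction on top of its input.
stacked_out = features
for _ in range(3):
    stacked_out = residual_block(stacked_out, scale_branch)
```

Because the identity path is always present, gradients can flow through the addition directly, which is the property that makes deeper stacks easier to train.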

Figure: Generator architecture

Discriminator Architecture

The discriminator uses multiple convolutional layers to distinguish between real high-resolution images and generated images.
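As a rough illustration of what the discriminator is trained to do, the sketch below computes the standard GAN binary cross-entropy objective from raw discriminator scores. It assumes the discriminator ends in a sigmoid that outputs the probability of an image being a real high-resolution image. The function names and the example scores are hypothetical, and the convolutional layers themselves are omitted.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def discriminator_loss(
    real_logits: Sequence[float],
    fake_logits: Sequence[float],
    eps: float = 1e-12,
) -> float:
    """Mean binary cross-entropy: real images labeled 1, generated images 0."""
    real_term = sum(-math.log(sigmoid(z) + eps) for z in real_logits)
    fake_term = sum(-math.log(1.0 - sigmoid(z) + eps) for z in fake_logits)
    return (real_term + fake_term) / (len(real_logits) + len(fake_logits))


# A discriminator that separates the two classes well has a low loss.
confident = discriminator_loss(real_logits=[4.0, 5.0], fake_logits=[-4.0, -5.0])

# A discriminator that cannot tell them apart (probability 0.5) scores log(2).
confused = discriminator_loss(real_logits=[0.0, 0.0], fake_logits=[0.0, 0.0])
```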

Figure: Discriminator architecture

Loss Function Design

SRGAN uses a perceptual loss function that combines content loss and adversarial loss to improve both image quality and realism.

Content Loss

Traditional super-resolution methods typically use Mean Squared Error (MSE) as the content loss, which measures pixel-wise differences between generated and target images. However, MSE tends to produce overly smooth images because it averages over all plausible high-resolution images that could correspond to a given low-resolution input.

SRGAN proposes using VGG loss instead, which computes the difference between feature representations extracted from a pre-trained VGG-19 network. This approach focuses on perceptually important features rather than raw pixel values. The VGG loss can be computed at different network depths by choosing which feature map \phi_{i,j} to compare:

l^{SR}_{VGG/i,j} = \frac{1}{W_{i,j} H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \left( \phi_{i,j}(I^{HR})_{x,y} - \phi_{i,j}(G_{\theta_G}(I^{LR}))_{x,y} \right)^2

Here \phi_{i,j} is the feature map obtained from the j-th convolution (after activation) before the i-th max-pooling layer of VGG-19, W_{i,j} and H_{i,j} are its width and height, I^{HR} is the high-resolution target, and G_{\theta_G}(I^{LR}) is the generator output for the low-resolution input I^{LR}.
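The VGG loss can be sketched as a mean squared difference between two feature maps. The example below assumes the feature maps have already been extracted; it does not run VGG-19. The name `feature_mse` and the 2x2 values are hypothetical, and a single channel stored as nested lists stands in for a real feature tensor.

```python
from typing import List

FeatureMap = List[List[float]]  # H rows by W columns, one channel


def feature_mse(feat_hr: FeatureMap, feat_sr: FeatureMap) -> float:
    """Sum of squared feature differences, normalized by W * H."""
    height = len(feat_hr)
    width = len(feat_hr[0])
    if len(feat_sr) != height or any(len(row) != width for row in feat_sr):
        raise ValueError("feature maps must have the same shape")

    total = 0.0
    for row_hr, row_sr in zip(feat_hr, feat_sr):
        for a, b in zip(row_hr, row_sr):
            total += (a - b) ** 2
    return total / (width * height)


# Hypothetical feature maps for the target image and the generated image.
phi_hr = [[1.0, 2.0], [3.0, 4.0]]
phi_sr = [[1.0, 2.5], [2.0, 4.0]]

# Squared differences are 0, 0.25, 1, 0, so the loss is 1.25 / 4.
vgg_loss = feature_mse(phi_hr, phi_sr)
```

Applying the same function to raw pixel grids instead of feature maps gives the pixel-wise MSE that the VGG loss replaces.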

Adversarial Loss

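Adversarial loss encourages the generator to produce images that appear realistic to the discriminator. It is summed over the N training samples:

l^{SR}_{Gen} = \sum_{n=1}^{N} -\log D_{\theta_D}(G_{\theta_G}(I^{LR}))

Here D_{\theta_D}(G_{\theta_G}(I^{LR})) is the probability the discriminator assigns to the generated image being a real high-resolution image. The loss is small when the discriminator is fooled and large when it confidently rejects the generated image.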
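The adversarial term can be evaluated directly once the discriminator's output probabilities are known. The following is a minimal sketch with made-up probabilities: `generator_adversarial_loss` is a hypothetical helper name, and the small `eps` is an added safeguard against taking the logarithm of zero, not part of the formula.

```python
import math
from typing import Sequence


def generator_adversarial_loss(d_probs: Sequence[float], eps: float = 1e-12) -> float:
    """Sum of -log D(G(I_LR)) over a batch of generated images."""
    return sum(-math.log(p + eps) for p in d_probs)


# Discriminator is mostly fooled: probabilities near 1 give a small loss.
fooled = generator_adversarial_loss([0.9, 0.8])

# Discriminator rejects the images: probabilities near 0 give a large loss.
rejected = generator_adversarial_loss([0.1, 0.2])
```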

Total Loss - Perceptual loss

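The perceptual loss is a weighted sum of the content loss l^{SR}_X (the VGG loss, or MSE) and the adversarial loss, with the adversarial term scaled by 10^{-3}:

l^{SR} = l^{SR}_X + 10^{-3} l^{SR}_{Gen}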
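As a small worked example of the weighting, the sketch below combines a content loss and an adversarial loss using the 10^{-3} factor from the formula. The input values are made up for illustration, and `perceptual_loss` is a hypothetical helper name.

```python
def perceptual_loss(
    content_loss: float,
    adversarial_loss: float,
    adv_weight: float = 1e-3,
) -> float:
    """Content loss plus the down-weighted adversarial loss."""
    return content_loss + adv_weight * adversarial_loss


# Illustrative values only: content loss 0.3125, adversarial loss 2.0.
total = perceptual_loss(content_loss=0.3125, adversarial_loss=2.0)
```

With these numbers the adversarial term contributes 0.002 to a total of about 0.3145, which shows how the small weight keeps the content term dominant.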

Training Process and Results

During training, high-resolution images are downsampled to create low-resolution inputs for the generator. The generator and discriminator then train adversarially to improve image quality and realism.

Limitations

Although SRGAN produces high-quality images, it also has some limitations.

Applications

SRGAN is widely used in tasks where high visual quality is important.