Deep Learning: Generative Adversarial Networks

Generative Adversarial Networks (GANs) produce entirely new data — realistic images, voices, videos, and more — that never existed before. Two neural networks compete against each other in a creative arms race, and the result is a generator capable of producing strikingly realistic synthetic content.

The Core Idea: Two Competing Networks

A GAN contains two networks with opposing goals:

Generator — tries to produce fake data so convincing that it passes as real
Discriminator — tries to tell the difference between real data and the Generator's fakes

The Counterfeiter vs. Detective Analogy

Counterfeiter (Generator):
  Start: makes crude fake banknotes
  Goal: produce notes so realistic the detective can't catch them

Detective (Discriminator):
  Start: easily spots crude fakes
  Goal: always identify counterfeits correctly

Round 1: Detective catches all fakes → Counterfeiter improves technique
Round 2: Some fakes slip through → Detective improves detection
Round 3: Better fakes, better detection → both improve
...
Endgame: Fakes are indistinguishable from real notes

GAN Architecture

Full Training Diagram

REAL DATA ─────────────────────────────────┐
                                           ↓
RANDOM NOISE → [GENERATOR] → FAKE DATA → [DISCRIMINATOR] → Real or Fake?
                    ↑                          │
                    │                          ↓
                    └──── Loss signal ←────────┘
                         (Generator improves)

Two loss signals flow:
  1. Discriminator loss: how well it separates real from fake
  2. Generator loss: how often its fakes fooled the discriminator
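
As a rough illustration of these two loss signals, the sketch below computes both from the Discriminator's outputs using binary cross-entropy. The function names and the sample probabilities are assumptions made for this example, and the generator loss shown is the commonly used "non-saturating" form rather than the only possible choice.

```python
import math

def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)  # keep log() away from 0
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def discriminator_loss(d_real, d_fake):
    """Real samples should be scored 1, fakes should be scored 0."""
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Low when the Discriminator scores the fakes as real (close to 1)."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)

# Hypothetical Discriminator outputs: probability that each input is real.
d_real = [0.9, 0.8]
d_fake = [0.2, 0.1]
print("discriminator loss:", discriminator_loss(d_real, d_fake))
print("generator loss:    ", generator_loss(d_fake))
```

With these made-up numbers the Discriminator is doing well, so its loss is small while the Generator's loss is large; the two values move in opposite directions as the fakes improve.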

How Training Works Step by Step

Phase 1: Train the Discriminator

1. Show Discriminator real images → Label: REAL (1)
2. Show Discriminator Generator's fake images → Label: FAKE (0)
3. Discriminator learns to classify correctly
4. Update Discriminator weights only

Phase 2: Train the Generator

1. Generator creates fake images from random noise
2. Feed the fakes to the Discriminator, whose weights are frozen during this phase
3. Discriminator classifies them
4. Generator's goal: make the Discriminator output REAL (1) for its fakes
5. Update Generator weights only

The two phases alternate: train D, then G, then D, then G, and so on.
Ideally, both networks keep improving until the Discriminator can no longer tell the Generator's output from real data.
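
The alternating procedure can be sketched on a deliberately tiny problem. Everything below is an assumption chosen for illustration, not a realistic GAN: the "real data" are numbers drawn around 4.0, the Generator is an affine map x = a*z + b, the Discriminator is a one-input logistic regression, and the gradients are written out by hand so that only the Python standard library is needed.

```python
import math
import random

def sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def generate(g, z):
    """Toy Generator: affine map of the noise, x = a*z + b."""
    return g["a"] * z + g["b"]

def discriminate(d, x):
    """Toy Discriminator: estimated probability that x is real."""
    return sigmoid(d["w"] * x + d["c"])

def d_step(d, g, real_batch, noise_batch, lr):
    """Phase 1: update only the Discriminator (real -> 1, fake -> 0)."""
    fake_batch = [generate(g, z) for z in noise_batch]
    n = len(real_batch)
    grad_w = grad_c = 0.0
    for x_real, x_fake in zip(real_batch, fake_batch):
        p_real = discriminate(d, x_real)
        p_fake = discriminate(d, x_fake)
        # Gradients of -log D(x_real) - log(1 - D(x_fake)) w.r.t. w and c.
        grad_w += ((p_real - 1.0) * x_real + p_fake * x_fake) / n
        grad_c += ((p_real - 1.0) + p_fake) / n
    return {"w": d["w"] - lr * grad_w, "c": d["c"] - lr * grad_c}

def g_step(d, g, noise_batch, lr):
    """Phase 2: update only the Generator; the Discriminator is frozen."""
    n = len(noise_batch)
    grad_a = grad_b = 0.0
    for z in noise_batch:
        p_fake = discriminate(d, generate(g, z))
        # Gradient of the non-saturating loss -log D(G(z)) w.r.t. a and b.
        common = (p_fake - 1.0) * d["w"]
        grad_a += common * z / n
        grad_b += common / n
    return {"a": g["a"] - lr * grad_a, "b": g["b"] - lr * grad_b}

def train(steps=2000, batch=32, lr=0.05, seed=0):
    rng = random.Random(seed)
    d = {"w": 0.1, "c": 0.0}
    g = {"a": 1.0, "b": 0.0}
    for _ in range(steps):
        real = [rng.gauss(4.0, 1.0) for _ in range(batch)]  # stand-in "real data"
        d = d_step(d, g, real, [rng.gauss(0.0, 1.0) for _ in range(batch)], lr)
        g = g_step(d, g, [rng.gauss(0.0, 1.0) for _ in range(batch)], lr)
    return d, g

if __name__ == "__main__":
    d, g = train()
    print("generator offset b:", g["b"])
```

Each step function returns new parameters for one network and leaves the other untouched, mirroring "update Discriminator weights only" and "update Generator weights only". On this toy problem the Generator's offset b is pushed toward the real data, but it may overshoot or oscillate; convergence is not guaranteed, which previews the instability discussed later.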

The Loss Functions

Discriminator wants to maximize:
  → Correctly labeling real images as real
  → Correctly labeling fake images as fake

Generator wants to maximize:
  → Fooling the Discriminator into labeling its fakes as real

They have directly opposing objectives — hence "adversarial."
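
These opposing goals are usually written as a single minimax objective, as in the original 2014 formulation. In the notation assumed here, D(x) is the Discriminator's estimated probability that x is real, G(z) is the Generator's output for noise z, p_data is the real data distribution, and p_z is the noise distribution:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

The Discriminator pushes V up by scoring real data near 1 and fakes near 0; the Generator pushes V down by making D(G(z)) large. In practice the Generator is often trained to maximize log D(G(z)) instead, which expresses the same goal but tends to give stronger gradients early in training.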

What GANs Produce

Image Synthesis

Input (Generator): random noise vector z = [0.34, -0.72, 0.88, ...]
Output: a photorealistic human face that does not belong to any real person

The website thispersondoesnotexist.com generates a new AI face every time you reload.
Every face is produced by a GAN from random noise.
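
To make the "random noise vector" concrete, here is a minimal sketch of how such a vector might be drawn. The helper name `sample_latent` and the size of 100 are assumptions for illustration; real models choose their own latent size and feed z into a trained Generator, which is not shown here.

```python
import random

def sample_latent(dim, rng):
    """Draw one latent vector z with independent standard-normal entries."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

rng = random.Random(42)        # seeded only so the sketch is repeatable
z1 = sample_latent(100, rng)   # 100 is an assumed latent size
z2 = sample_latent(100, rng)
print(len(z1), z1[:3])
```

Every fresh z is a different point in the latent space, which is why each page reload can yield a different face.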

Image-to-Image Translation (Pix2Pix)

Input:                             Output:
Rough sketch of a building    →    Photorealistic building rendering
Aerial photo                  →    Street map of the same area
Daytime photo                 →    Nighttime version of the same photo
Black-and-white photo         →    Colorized version

Style Transfer (CycleGAN)

Photo of a horse          →   Same scene with the horse rendered as a zebra
Summer landscape photo    →   Same scene rendered as winter
Landscape photo           →   Same scene in the style of a Monet painting

GAN Challenges

Mode Collapse

Mode collapse happens when the Generator finds one type of output that reliably fools the Discriminator and then produces only that output, ignoring the diversity of the real data.

Example:
  Dataset: photos of cats, dogs, and birds
  Generator (with mode collapse) → produces only cats
  The cats fool the Discriminator, so the Generator stops exploring other outputs

Mitigation: the Wasserstein GAN (WGAN) loss and related training changes reduce mode collapse, though they do not always eliminate it
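
One simple way to spot the problem in the cats, dogs, and birds example is to count which categories actually show up in a batch of generated samples. The sketch below assumes that each generated image has already been labeled by some separate classifier (hypothetical, not shown); the function name `mode_coverage` is also an assumption.

```python
from collections import Counter

def mode_coverage(generated_labels, all_modes):
    """Return (fraction of modes present, per-mode counts) for a batch."""
    counts = Counter(generated_labels)
    seen = sum(1 for m in all_modes if counts[m] > 0)
    return seen / len(all_modes), {m: counts[m] for m in all_modes}

modes = ["cat", "dog", "bird"]
healthy = ["cat", "dog", "bird", "dog", "cat", "bird"]
collapsed = ["cat"] * 6

print(mode_coverage(healthy, modes))
print(mode_coverage(collapsed, modes))
```

A coverage well below 1.0 on a large batch suggests the Generator is ignoring some categories. This is only a coarse check, since it cannot detect collapse within a single category.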

Training Instability

The two networks must improve at a similar pace.
If the Discriminator becomes too powerful too quickly:
  → Generator gets no useful feedback → cannot improve → training collapses

If the Generator becomes too powerful too quickly:
  → Discriminator cannot learn → training stalls

Common mitigations: careful learning-rate tuning, gradient clipping, and batch normalization
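
Of these, gradient clipping is the easiest to show in isolation. The sketch below rescales a gradient vector whenever its length exceeds a chosen limit, so that one unusually large update cannot destabilize either network. The function name and the threshold are assumptions for illustration; deep learning libraries provide their own equivalents.

```python
import math

def clip_by_norm(grads, max_norm):
    """Rescale a gradient vector so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return list(grads)          # already small enough: leave unchanged
    scale = max_norm / norm
    return [g * scale for g in grads]

big = [3.0, 4.0]                    # L2 norm is 5.0
small = [0.1, -0.2]
print(clip_by_norm(big, max_norm=1.0))
print(clip_by_norm(small, max_norm=1.0))
```

Clipping preserves the direction of the update and only limits its size, which is why it is often a safe first thing to try when training diverges.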

Major GAN Variants

GAN Type	Innovation	Common Use
Vanilla GAN	Original 2014 design	Basic image synthesis
DCGAN	Uses convolutional layers	High-quality image generation
Conditional GAN (cGAN)	Adds a class label as input	Generate specific categories ("generate a dog")
CycleGAN	Unpaired image translation	Style transfer without matched pairs
StyleGAN	Fine control over output style	High-resolution face synthesis
WGAN	Wasserstein loss for better training stability	More reliable training, less mode collapse
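
The Conditional GAN row is easy to illustrate: one common way to add a class label is to append a one-hot encoding of it to the noise vector before it enters the Generator. The label set, the latent size of 8, and the helper names below are assumptions for this sketch, and other conditioning schemes (such as learned label embeddings) are also used.

```python
import random

def one_hot(index, num_classes):
    """Encode a class index as a vector with a single 1.0."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]

def conditional_input(z, class_index, num_classes):
    """Concatenate the noise vector with the one-hot class label."""
    return z + one_hot(class_index, num_classes)

classes = ["cat", "dog", "bird"]              # hypothetical label set
rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(8)]   # small latent size for display
g_input = conditional_input(z, classes.index("dog"), len(classes))
print(len(g_input), g_input[-3:])   # prints: 11 [0.0, 1.0, 0.0]
```

The Discriminator in a conditional GAN typically receives the same label alongside the image, so it can penalize a "dog" request that produced a cat.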

Real-World GAN Applications

Drug Discovery — GANs generate candidate molecular structures for pharmaceutical research
Data Augmentation — medical imaging researchers generate synthetic X-rays to train diagnostic models when real data is scarce
Fashion Design — designers use GANs to generate new clothing designs and visualize variations
Video Game Asset Creation — texture generation and environment variation using GANs
Image Restoration — restoring old, damaged, or low-resolution photographs

GANs vs VAEs

Feature	GAN	VAE
Output sharpness	Sharp, photorealistic	Often slightly blurry
Training stability	Prone to instability	More stable
Latent space	Less structured	Smooth, interpolatable
Training objective	Adversarial loss	Reconstruction loss plus KL regularization
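
The "interpolatable" latent space entry can be made concrete with a short sketch: pick two latent vectors and walk in a straight line between them. The vectors and the helper name `lerp` are assumptions for illustration; decoding each intermediate point with a trained model (not shown) is what would reveal how smoothly the outputs change.

```python
def lerp(z_a, z_b, t):
    """Linear interpolation between two latent vectors, with t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]

z_a = [0.0, 1.0, -1.0]   # made-up latent vectors
z_b = [2.0, -1.0, 1.0]

# Five evenly spaced points from z_a to z_b, endpoints included.
path = [lerp(z_a, z_b, i / 4) for i in range(5)]
for point in path:
    print(point)
```

In a well-structured latent space, neighboring points on this path decode to similar outputs, giving a gradual morph from one sample to the other.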

Key Terms

GAN — Generative Adversarial Network — two competing networks that jointly learn to generate realistic data
Generator — produces fake data from random noise
Discriminator — classifies inputs as real or fake
Mode Collapse — Generator gets stuck producing only one type of output
Latent Vector (z) — the random noise input to the Generator
Conditional GAN — a GAN guided by class labels to generate specific categories
