GCP Cloud Load Balancing

Cloud Load Balancing distributes incoming network traffic across multiple backend instances (VMs, containers, or serverless services). The goal is to ensure no single instance is overwhelmed, maximize availability, and route users to the nearest healthy backend for the best performance.

Think of a load balancer like a bank teller supervisor. When 100 customers walk in, the supervisor directs each person to the next available teller instead of sending everyone to the same one. If a teller is on break (instance is unhealthy), no more customers are sent there.

Why Load Balancing Matters

Without Load Balancing:
All traffic → Single VM → VM overloaded → Slow responses / Crashes

With Load Balancing:
Traffic → Load Balancer
              │
    ┌─────────┼─────────┐
    ▼         ▼         ▼
  VM 1      VM 2      VM 3
(healthy) (healthy) (healthy)
Each healthy VM handles a share of the traffic
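
The idea in the diagram can be sketched in a few lines of POSIX shell. This is only a toy round-robin over a hard-coded, hypothetical backend list (`vm-1`, `vm-2`, `vm-3` and their states are made up for illustration); Google's load balancers use their own balancing modes and capacity settings rather than this exact logic.

```shell
#!/bin/sh
# Hypothetical backend list as name:state pairs (not real GCP output).
BACKENDS="vm-1:healthy vm-2:unhealthy vm-3:healthy"

# pick_backend N: print the backend that serves request number N,
# rotating round-robin over the healthy backends only.
pick_backend() {
  request_no=$1
  healthy=""
  count=0
  for entry in $BACKENDS; do
    name=${entry%%:*}    # text before the colon
    state=${entry##*:}   # text after the colon
    if [ "$state" = "healthy" ]; then
      healthy="$healthy $name"
      count=$((count + 1))
    fi
  done
  if [ "$count" -eq 0 ]; then
    echo "no healthy backends" >&2
    return 1
  fi
  index=$((request_no % count + 1))
  # Load the healthy names into the positional parameters and select one.
  set -- $healthy
  shift $((index - 1))
  echo "$1"
}

# Requests alternate between vm-1 and vm-3; vm-2 is skipped while unhealthy.
for n in 0 1 2 3; do
  echo "request $n -> $(pick_backend "$n")"
done
```

Marking `vm-2` as healthy in `BACKENDS` puts it back into the rotation without any other change.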

Types of GCP Load Balancers

Load Balancer	Layer	Protocol	Use Case
Global External Application LB	7 (HTTP)	HTTP/HTTPS	Web apps, APIs with HTTPS and global routing
Regional External Application LB	7 (HTTP)	HTTP/HTTPS	Web apps limited to one region
External Passthrough Network LB	4 (TCP/UDP)	TCP, UDP	Game servers, non-HTTP protocols
Internal Application LB	7 (HTTP)	HTTP/HTTPS	Internal microservices within a VPC
Internal Passthrough Network LB	4 (TCP/UDP)	TCP, UDP	Internal services needing any TCP/UDP protocol

Global External Application Load Balancer Architecture

User in India                  User in USA
     │                              │
     ▼                              ▼
GCP Edge (Mumbai)           GCP Edge (Iowa)
     │                              │
     └──────────────┬───────────────┘
                    ▼
        Global Load Balancer (Anycast IP)
                    │
         URL Map (routes traffic)
                    │
        ┌───────────┴───────────┐
        ▼                       ▼
Backend Service A          Backend Bucket B
(for /api/* paths)         (for /static/* paths)
        │                       │
  ┌─────┴─────┐         Cloud Storage Bucket
  VM 1      VM 2
(us-central1-a) (us-central1-b)

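The Global LB uses a single anycast IP address that is advertised from Google's edge locations worldwide, so each user's request enters Google's network at the edge location closest to them. From there it travels over Google's network to the closest backend that is healthy and has capacity.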

Setting Up an HTTP Load Balancer

Key Components

Component	Role
Frontend	The public IP and port that users connect to
URL Map	Routes requests to different backends based on URL paths or hostnames
Backend Service	Groups the backends that serve traffic and specifies the health check and protocol used for them
Backend (Instance Group or NEG)	The VMs (instance groups) or the containers and serverless services (network endpoint groups) that handle requests
Health Check	Determines if a backend is healthy and should receive traffic

Via Cloud Shell (CLI Setup)

# Step 1 — Create an instance group (backend VMs)
gcloud compute instance-groups managed create web-servers \
  --zone=us-central1-a \
  --size=3 \
  --template=web-server-template

# Name port 80 "http" on the group; the backend service looks up this named port by default
gcloud compute instance-groups managed set-named-ports web-servers \
  --zone=us-central1-a \
  --named-ports=http:80

# Step 2 — Create a health check
gcloud compute health-checks create http http-basic-check \
  --port 80 \
  --request-path /health

# Step 3 — Create a backend service
gcloud compute backend-services create web-backend \
  --protocol=HTTP \
  --health-checks=http-basic-check \
  --global

# Step 4 — Add the instance group to the backend service
gcloud compute backend-services add-backend web-backend \
  --instance-group=web-servers \
  --instance-group-zone=us-central1-a \
  --global

# Step 5 — Create a URL map
gcloud compute url-maps create web-map \
  --default-service web-backend

# Step 6 — Create a target HTTP proxy
gcloud compute target-http-proxies create http-proxy \
  --url-map=web-map

# Step 7 — Create a forwarding rule (public IP + port 80)
gcloud compute forwarding-rules create http-rule \
  --global \
  --target-http-proxy=http-proxy \
  --ports=80
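
After these steps, two follow-ups are usually needed before traffic flows: a firewall rule that lets Google's health-check and proxy ranges reach the VMs, and a quick test against the load balancer's IP. The sketch below assumes the VMs sit on the `default` network and uses a made-up rule name (`allow-lb-health-checks`); adjust both to your project.

```shell
# Allow Google's health-check / proxy source ranges to reach the backends on port 80.
# Assumptions: network "default"; the rule name is hypothetical.
gcloud compute firewall-rules create allow-lb-health-checks \
  --network=default \
  --direction=INGRESS \
  --action=ALLOW \
  --rules=tcp:80 \
  --source-ranges=130.211.0.0/22,35.191.0.0/16

# Look up the IP address assigned to the forwarding rule from Step 7
LB_IP=$(gcloud compute forwarding-rules describe http-rule \
  --global --format="get(IPAddress)")

# Show backend health as the load balancer sees it
gcloud compute backend-services get-health web-backend --global

# Send a test request (a new load balancer can take a few minutes to start serving)
curl -i "http://${LB_IP}/"
```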

Health Checks

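A health check regularly sends a probe request to each backend. If a backend fails a configured number of consecutive probes (the unhealthy threshold), the load balancer stops sending new traffic to it; once it passes enough consecutive probes again (the healthy threshold), it returns to rotation.

Load Balancer sends HTTP GET /health to each VM every 5 seconds
        │
VM responds with HTTP 200 OK → Healthy ✓ (receives traffic)
VM responds with HTTP 500    → Unhealthy ✗ (removed from rotation after repeated failures)
VM does not respond in time  → Unhealthy ✗ (removed from rotation after repeated failures)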

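# Health check configuration: probe port 8080 at /health every 10s, wait up to 5s per probe,
# mark a backend healthy after 2 consecutive successes and unhealthy after 3 consecutive failures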
gcloud compute health-checks create http my-health-check \
  --port=8080 \
  --request-path=/health \
  --check-interval=10s \
  --timeout=5s \
  --healthy-threshold=2 \
  --unhealthy-threshold=3
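
To make the two thresholds concrete, here is a small shell sketch that replays a sequence of probe results and reports the resulting state. It is a simplified model, not Google's implementation: the function name `health_state` is made up, and it assumes the backend starts out healthy. With the settings above, three failed probes at a 10-second interval mean an outage is detected after roughly 30 seconds.

```shell
#!/bin/sh
# health_state PROBE...: replay probe results ("ok" or "fail") and print
# the final state. Thresholds mirror the flags above.
# Assumption: the backend starts in the healthy state.
health_state() {
  healthy_threshold=2
  unhealthy_threshold=3
  state=healthy
  ok_streak=0
  fail_streak=0
  for probe in "$@"; do
    if [ "$probe" = "ok" ]; then
      ok_streak=$((ok_streak + 1))
      fail_streak=0
      # Recover only after enough consecutive successes
      if [ "$state" = "unhealthy" ] && [ "$ok_streak" -ge "$healthy_threshold" ]; then
        state=healthy
      fi
    else
      fail_streak=$((fail_streak + 1))
      ok_streak=0
      # Remove from rotation only after enough consecutive failures
      if [ "$state" = "healthy" ] && [ "$fail_streak" -ge "$unhealthy_threshold" ]; then
        state=unhealthy
      fi
    fi
  done
  echo "$state"
}

health_state fail fail            # two failures are not enough: healthy
health_state fail fail fail       # third consecutive failure: unhealthy
health_state fail fail fail ok ok # two consecutive successes: healthy again
```

A single success between failures resets the failure streak, which is why a brief hiccup does not take a backend out of rotation.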

SSL/TLS Termination

For HTTPS, the load balancer handles SSL/TLS termination: it decrypts HTTPS traffic at Google's edge and, when the backend service protocol is HTTP (as in the setup above), forwards plain HTTP to the backend VMs. The VMs then do not need to perform TLS handshakes or decryption, which reduces their CPU load.

User (HTTPS) ──▶ Load Balancer (SSL terminates here)
                        │
                        │ HTTP (unencrypted, internal)
                        ▼
                     Backend VMs

# Create an SSL certificate (managed by Google)
gcloud compute ssl-certificates create my-ssl-cert \
  --domains=www.mysite.com

# Create HTTPS frontend
gcloud compute target-https-proxies create https-proxy \
  --url-map=web-map \
  --ssl-certificates=my-ssl-cert

gcloud compute forwarding-rules create https-rule \
  --global \
  --target-https-proxy=https-proxy \
  --ports=443
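
A Google-managed certificate does not become usable immediately: it is provisioned only once the domain's DNS record points at the load balancer's IP address. A command along these lines can be used to watch its progress; it reuses the `my-ssl-cert` name from above and assumes the certificate is a global resource.

```shell
# Show the provisioning status of the managed certificate and of each domain.
# The status typically moves from PROVISIONING to ACTIVE once DNS resolves to the LB.
gcloud compute ssl-certificates describe my-ssl-cert \
  --global \
  --format="get(name,managed.status,managed.domainStatus)"
```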

URL-Based Routing

One load balancer can route requests to different backends based on URL paths:

www.mysite.com/         → Web Server Backend (serves HTML pages)
www.mysite.com/api/*    → API Backend (Node.js microservice)
www.mysite.com/static/* → Cloud Storage Bucket (images, CSS, JS)

This is configured in the URL map using host rules and path matchers, whose path rules map each URL pattern to a backend service or backend bucket.
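
As a rough sketch, the routing table above could be added to the `web-map` URL map from the setup section like this. The names `api-backend` (an already existing backend service), `static-assets` (a backend bucket), `my-static-assets-bucket` (a Cloud Storage bucket), and `site-paths` are all hypothetical placeholders.

```shell
# Wrap the Cloud Storage bucket in a backend bucket so the URL map can reference it.
# Assumption: the bucket my-static-assets-bucket already exists.
gcloud compute backend-buckets create static-assets \
  --gcs-bucket-name=my-static-assets-bucket

# Add a path matcher for www.mysite.com:
#   /api/*    -> api-backend (backend service, assumed to exist)
#   /static/* -> static-assets (backend bucket)
#   all else  -> web-backend (default service)
gcloud compute url-maps add-path-matcher web-map \
  --path-matcher-name=site-paths \
  --new-hosts=www.mysite.com \
  --default-service=web-backend \
  --backend-service-path-rules="/api/*=api-backend" \
  --backend-bucket-path-rules="/static/*=static-assets"
```

Requests that match no path rule fall through to the path matcher's default service, which is how `www.mysite.com/` reaches the web server backend.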

Key Takeaways

Cloud Load Balancing distributes traffic across multiple backends for availability and performance.
The Global External Application LB uses a single Anycast IP for worldwide routing.
Health checks automatically remove unhealthy backends from the traffic rotation.
SSL termination at the load balancer reduces processing load on backend VMs.
URL-based routing sends different request paths to different backend services.
Internal load balancers route traffic within a VPC for microservice communication.
