GCP GKE

Google Kubernetes Engine (GKE) is GCP's managed Kubernetes service. Kubernetes is an open-source system that automates the deployment, scaling, and management of containerized applications across a cluster of machines. GKE handles the setup and management of the Kubernetes control plane, so the focus stays on deploying applications rather than managing cluster infrastructure.

Imagine running a fleet of delivery trucks. Kubernetes is the logistics system — it decides which truck carries which cargo, monitors truck health, replaces broken trucks automatically, and scales the fleet up during busy seasons. GKE provides this system, already configured and ready to use.

Core Kubernetes Concepts

Cluster

A cluster is a group of machines (nodes) that run containerized applications managed by Kubernetes. A GKE cluster has two parts:

GKE Cluster
├── Control Plane (managed by Google)
│   ├── API Server        ← Accepts commands (kubectl, Console)
│   ├── Scheduler         ← Decides which node runs which pod
│   ├── Controller Manager ← Monitors and maintains desired state
│   └── etcd              ← Stores all cluster state and config
│
└── Worker Nodes (your VMs)
    ├── Node 1 (e2-standard-4)
    ├── Node 2 (e2-standard-4)
    └── Node 3 (e2-standard-4)

Pod

A pod is the smallest deployable unit in Kubernetes. Each pod runs one or more containers; those containers share the same network namespace (IP address and ports) and can share storage volumes. Think of a pod as a single instance of an application.
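Although pods are usually created through a Deployment, a standalone pod can be defined in a few lines of YAML. A minimal sketch (the name and image are illustrative):

```yaml
# pod.yaml (illustrative): a single pod running one nginx container
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  labels:
    app: my-pod
spec:
  containers:
  - name: web
    image: nginx:1.25   # example image; replace with your own
    ports:
    - containerPort: 80
```

Apply it with kubectl apply -f pod.yaml. Note that a bare pod is not restarted on another node if its node fails, which is why Deployments are preferred in practice.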

Deployment

A deployment manages a set of identical pods. It ensures the desired number of pods is always running. If a pod crashes, the deployment replaces it automatically.

Service

A Kubernetes Service exposes pods to network traffic. Since pod IP addresses change when pods restart, a Service provides a stable endpoint (IP address or DNS name) that always points to the correct pods.

Deployment: my-app
┌──────────────────────────────────────────────┐
│  Pod 1 (10.0.0.5)   ──┐                      │
│  Pod 2 (10.0.0.6)   ──┤──▶ Service: my-app   │
│  Pod 3 (10.0.0.7)   ──┘    (Stable IP/DNS)   │
└──────────────────────────────────────────────┘
                               │
                               ▼
                         External Traffic

Creating a GKE Cluster

# Create a standard cluster with 3 nodes
gcloud container clusters create my-cluster \
  --zone us-central1-a \
  --num-nodes 3 \
  --machine-type e2-standard-4

# Configure kubectl to connect to the cluster
gcloud container clusters get-credentials my-cluster \
  --zone us-central1-a

GKE also offers an Autopilot mode in which Google manages node provisioning, scaling, and maintenance automatically; you pay for the resources your pods request rather than for the underlying nodes.

# Create an Autopilot cluster (fully managed by Google)
gcloud container clusters create-auto my-autopilot-cluster \
  --region us-central1

Deploying an Application to GKE

Step 1 – Write the Kubernetes Deployment YAML

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-web-app
  template:
    metadata:
      labels:
        app: my-web-app
    spec:
      containers:
      - name: web
        image: gcr.io/my-project/my-web-app:v1
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: "250m"
            memory: "256Mi"
          limits:
            cpu: "500m"
            memory: "512Mi"

Step 2 – Write the Kubernetes Service YAML

# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: my-web-app-service
spec:
  type: LoadBalancer
  selector:
    app: my-web-app
  ports:
  - port: 80
    targetPort: 8080

Step 3 – Apply to the Cluster

# Apply the deployment
kubectl apply -f deployment.yaml

# Apply the service
kubectl apply -f service.yaml

# Check pods are running
kubectl get pods

# Get the external IP address of the service
kubectl get service my-web-app-service
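Once the LoadBalancer Service has provisioned an external IP (this can take a minute or two), the application can be reached from outside the cluster. A quick sketch; the IP address shown is a placeholder for whatever kubectl reports:

```shell
# Watch the service until EXTERNAL-IP changes from <pending> to a real address
kubectl get service my-web-app-service --watch

# Send a test request (34.123.45.67 is a placeholder for your external IP)
curl http://34.123.45.67/
```

The Service listens on port 80 and forwards to containerPort 8080, as defined in service.yaml.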

Kubernetes Scaling

Manual Scaling

# Scale a deployment to 5 replicas
kubectl scale deployment my-web-app --replicas=5

Horizontal Pod Autoscaler (HPA)

HPA automatically adjusts the number of pods based on CPU or memory usage.

# Scale between 2 and 10 pods based on CPU usage
kubectl autoscale deployment my-web-app \
  --min=2 --max=10 --cpu-percent=70

# Or define in YAML:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-web-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

Rolling Updates and Rollbacks

Kubernetes updates applications without downtime using rolling updates. It replaces pods one at a time, ensuring some pods always handle traffic during the update.

Before Update:
Pod v1 ✓   Pod v1 ✓   Pod v1 ✓

During Update (rolling):
Pod v2 ✓   Pod v1 ✓   Pod v1 ✓  ← v2 serves traffic while others update
Pod v2 ✓   Pod v2 ✓   Pod v1 ✓
Pod v2 ✓   Pod v2 ✓   Pod v2 ✓

After Update:
Pod v2 ✓   Pod v2 ✓   Pod v2 ✓

# Update the container image
kubectl set image deployment/my-web-app web=gcr.io/my-project/my-web-app:v2

# Roll back to the previous version if issues arise
kubectl rollout undo deployment/my-web-app
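While an update is in progress, its status and revision history can be inspected with the standard kubectl rollout subcommands:

```shell
# Watch the rolling update until it completes (or fails)
kubectl rollout status deployment/my-web-app

# List the deployment's previous revisions
kubectl rollout history deployment/my-web-app

# Roll back to a specific revision instead of just the previous one
kubectl rollout undo deployment/my-web-app --to-revision=1
```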

GKE Standard vs Autopilot

Feature          | GKE Standard                           | GKE Autopilot
-----------------|----------------------------------------|-----------------------------------------------
Node management  | Manual: you specify node count and type| Fully managed by Google
Billing          | Per node (VM)                          | Per pod resource usage
Flexibility      | Full control over node config          | Limited: Google sets node specs
Best for         | Teams needing full Kubernetes control  | Teams wanting simplicity and lower ops burden

Key Takeaways

  • GKE is Google's managed Kubernetes service for running containers at scale.
  • A cluster has a Control Plane (managed by Google) and Worker Nodes (VMs).
  • Pods are the smallest deployable unit; Deployments manage groups of pods.
  • Services provide stable endpoints for accessing pods.
  • Horizontal Pod Autoscaler scales pods automatically based on CPU/memory.
  • Rolling updates deploy new versions without downtime.
  • Autopilot mode removes all node management responsibility.
