Machine Learning: K-Nearest Neighbors

K-Nearest Neighbors (KNN) is one of the simplest machine learning algorithms. It makes predictions by looking at the K closest training records to a new data point and taking a majority vote (for classification) or an average (for regression). There is no separate training phase — the algorithm stores all training data and defers computation to prediction time (it is a "lazy learner").

The Core Idea: Neighbors Vote

Analogy:
  Moving to a new neighborhood.
  To know whether a street is safe, ask the 5 nearest neighbors.
  If 4 say "safe" and 1 says "unsafe" → it is probably safe.

KNN does exactly this with data points.

How KNN Works Step by Step

Dataset: Customer churn prediction
  Each customer has: (Age, Monthly Spend, Churned: Yes/No)

New Customer: Age=35, Monthly Spend=₹2000
  Goal: Will this customer churn?

Step 1: Calculate distance from new point to every training record
Step 2: Sort records by distance (closest first)
Step 3: Pick top K records (say K=5)
Step 4: Count class votes among those K neighbors
Step 5: Majority class = prediction
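
The five steps above can be sketched in plain Python. `knn_predict` is a hypothetical helper (not from any library), and the small churn dataset is purely illustrative:

```python
import math

def knn_predict(training, new_point, k=5):
    """Classify new_point by majority vote among its k nearest training records.

    training: list of ((feature, feature, ...), label) tuples.
    """
    # Step 1: distance from the new point to every training record
    scored = [(math.dist(features, new_point), label)
              for features, label in training]
    # Step 2: sort by distance, closest first
    scored.sort(key=lambda pair: pair[0])
    # Step 3: keep only the top K records
    nearest = scored[:k]
    # Step 4: count class votes among those K neighbors
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    # Step 5: the majority class is the prediction
    return max(votes, key=votes.get)

# Illustrative churn data: ((Age, Monthly Spend), Churned?)
training = [((33, 1950), "No"), ((36, 2100), "No"),
            ((34, 1800), "Yes"), ((37, 2300), "No"),
            ((32, 2200), "No"), ((60, 5000), "Yes")]
print(knn_predict(training, (35, 2000), k=5))  # No
```

`math.dist` (Python 3.8+) computes exactly the Euclidean distance described in the next section.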

Distance Calculation

Euclidean Distance (most common):
  Distance = √((X2-X1)² + (Y2-Y1)²)

Example:
  New Customer:    Age=35, Spend=2000
  Training Point A: Age=32, Spend=1900
  Training Point B: Age=50, Spend=4500

  Distance to A = √((35-32)² + (2000-1900)²)
               = √(9 + 10000) = √10009 ≈ 100.05

  Distance to B = √((35-50)² + (2000-4500)²)
               = √(225 + 6250000) = √6250225 ≈ 2500.05

  Point A is much closer → far more likely to be among the K nearest neighbors and cast a vote.
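
The same two distances can be checked with Python's `math.dist`:

```python
import math

new_customer = (35, 2000)
point_a = (32, 1900)
point_b = (50, 4500)

dist_a = math.dist(new_customer, point_a)  # sqrt(9 + 10000)
dist_b = math.dist(new_customer, point_b)  # sqrt(225 + 6250000)
print(dist_a, dist_b)  # roughly 100.045 and 2500.045
```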

Complete Prediction with K=5

5 Nearest Neighbors of New Customer:

┌──────────────┬─────┬───────┬──────────┬──────────┐
│ Neighbor     │ Age │ Spend │ Distance │ Churned? │
├──────────────┼─────┼───────┼──────────┼──────────┤
│ Customer 12  │ 33  │ 1950  │   50.04  │ No       │
│ Customer 7   │ 36  │ 2100  │  100.00  │ No       │
│ Customer 45  │ 34  │ 1800  │  200.00  │ Yes      │
│ Customer 18  │ 32  │ 2200  │  200.02  │ No       │
│ Customer 31  │ 37  │ 2300  │  300.01  │ No       │
└──────────────┴─────┴───────┴──────────┴──────────┘

Votes: No=4, Yes=1
Prediction: No (will NOT churn) ✓
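
The vote count can be reproduced with `collections.Counter`, using the Churned? labels from the table of neighbors:

```python
from collections import Counter

# Churned? labels of the 5 nearest neighbors, closest first
neighbor_labels = ["No", "No", "Yes", "No", "No"]

votes = Counter(neighbor_labels)
prediction = votes.most_common(1)[0][0]
print(dict(votes), "->", prediction)  # {'No': 4, 'Yes': 1} -> No
```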

Choosing K

┌────────┬──────────────────────────────────────────────────────┐
│ K      │ Effect                                               │
├────────┼──────────────────────────────────────────────────────┤
│ K = 1  │ Very sensitive — single nearest neighbor decides     │
│        │ Overfits — memorizes training data                   │
│ K = 3  │ Slightly more stable                                 │
│ K = 5  │ Common default — good balance                        │
│ K large│ Very smooth boundary — may underfit                  │
└────────┴──────────────────────────────────────────────────────┘

Rule of Thumb:
  K = √(number of training records)

For 100 records: K ≈ 10 → use 9 or 11
For 10,000 records: K ≈ 100 → use 99 or 101

Always use an odd K for binary classification to avoid ties.
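
The two rules of thumb (K ≈ √N, odd K for binary classification) can be combined in one hypothetical helper:

```python
import math

def rule_of_thumb_k(n_records):
    """K ~ sqrt(N), nudged up to the next odd number to avoid ties."""
    k = round(math.sqrt(n_records))
    return k if k % 2 == 1 else k + 1

print(rule_of_thumb_k(100))     # 11  (sqrt gives 10, made odd)
print(rule_of_thumb_k(10_000))  # 101
```

Treat the result as a starting point; in practice K is usually tuned with cross-validation.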

Importance of Feature Scaling in KNN

Without Scaling:
  Age range:   18 – 65  (difference of 47)
  Salary range: 20,000 – 5,00,000  (difference of 4,80,000)

  Distance calculation heavily dominated by Salary.
  Age effectively has no influence at all.

With Scaling (normalize both to 0–1):
  Age and Salary both contribute equally to distance.
  More accurate neighbors.

ALWAYS scale features before applying KNN.
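
A minimal sketch of min-max scaling, one common choice (standardization to zero mean and unit variance also works). The sample values are illustrative:

```python
def min_max_scale(values):
    """Rescale a list of numbers to the 0-1 range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

ages = [18, 35, 65]
salaries = [20_000, 200_000, 500_000]

print(min_max_scale(ages))      # [0.0, 0.36..., 1.0]
print(min_max_scale(salaries))  # [0.0, 0.375, 1.0]
```

After scaling, a 1-unit difference in either feature moves the distance by the same amount, so Age and Salary carry equal weight.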

KNN for Regression

Instead of majority vote, take the average of K neighbors' values.

New house: Size=1500 sqft
5 Nearest Neighbors:
  House A: 1480 sqft → Price ₹2,40,000
  House B: 1510 sqft → Price ₹2,55,000
  House C: 1490 sqft → Price ₹2,48,000
  House D: 1520 sqft → Price ₹2,60,000
  House E: 1470 sqft → Price ₹2,38,000

Predicted Price = Average = ₹2,48,200
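
A sketch of the regression variant, averaging the five neighbor prices above:

```python
def knn_regress(neighbor_values):
    """KNN regression: predict the average of the K neighbors' target values."""
    return sum(neighbor_values) / len(neighbor_values)

prices = [240_000, 255_000, 248_000, 260_000, 238_000]
print(knn_regress(prices))  # 248200.0
```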

Advantages and Limitations of KNN

Advantages:
  ✓ No training phase — easy to add new data
  ✓ Simple and intuitive
  ✓ Naturally handles multi-class problems
  ✓ Good for non-linear boundaries

Limitations:
  ✗ Very slow at prediction time for large datasets
  ✗ Memory-heavy — stores all training data
  ✗ Sensitive to irrelevant features
  ✗ Must scale features before use
  ✗ Struggles with high-dimensional data (curse of dimensionality)
