Machine Learning: The Bias-Variance Tradeoff
The bias-variance tradeoff is one of the most fundamental concepts in machine learning. Every model makes errors — understanding where those errors come from helps in building models that perform well on new, unseen data.
Two Sources of Prediction Error
Total Error = Bias² + Variance + Irreducible Noise

Bias: error from wrong assumptions in the model
Variance: error from sensitivity to small changes in the training data
Noise: random error in the data itself (cannot be eliminated)
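The decomposition can be estimated empirically by refitting the same model on many independently sampled training sets and comparing its predictions to the truth at fixed test points. The sketch below (assuming NumPy and an illustrative quadratic target; the specific constants are made up for the demo) does this for a straight-line model:

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    # illustrative curved target: Y = X^2 + 3X (noise is added when sampling)
    return x**2 + 3*x

# Refit a straight-line model on many independent training sets and
# record its predictions at fixed test points.
x_test = np.linspace(-3, 3, 50)
preds = []
for _ in range(200):
    x_train = rng.uniform(-3, 3, 30)
    y_train = true_fn(x_train) + rng.normal(0.0, 1.0, 30)  # noise sd = 1
    m, b = np.polyfit(x_train, y_train, 1)                 # degree-1 fit
    preds.append(m * x_test + b)
preds = np.array(preds)

mean_pred = preds.mean(axis=0)
bias_sq = np.mean((mean_pred - true_fn(x_test))**2)  # Bias² term
variance = np.mean(preds.var(axis=0))                # Variance term
noise = 1.0**2                                       # irreducible (known by construction)

print(f"Bias^2 ~ {bias_sq:.2f}  Variance ~ {variance:.2f}  Noise = {noise:.2f}")
```

Because the straight line cannot follow the curve, the Bias² term dominates the Variance term here.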
Bias Explained
High Bias = Model is too simple to capture real patterns. It makes strong, incorrect assumptions about the data.

Analogy: A doctor always diagnoses "common cold" no matter what symptoms appear. That doctor has high bias — always defaulting to the same answer.

High Bias in ML:
True pattern: Y = X² + 3X + noise (curved)
Model assumes: Y = mX + b (straight line)
The line misses the curve completely. The model underfits — bad on both training AND test data.

Low Bias: Model is flexible enough to capture the real relationship.
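The underfitting signature (bad on both training and test data) can be reproduced directly. A minimal sketch, assuming NumPy and the same illustrative Y = X² + 3X + noise target as above:

```python
import numpy as np

rng = np.random.default_rng(1)

def make_data(n):
    # hypothetical data following the curved pattern from the text
    x = rng.uniform(-3, 3, n)
    y = x**2 + 3*x + rng.normal(0.0, 1.0, n)  # Y = X² + 3X + noise
    return x, y

x_train, y_train = make_data(100)
x_test, y_test = make_data(100)

# High-bias model: assume Y = mX + b even though the truth is curved.
m, b = np.polyfit(x_train, y_train, 1)

def mse(x, y):
    return np.mean((m * x + b - y)**2)

train_mse, test_mse = mse(x_train, y_train), mse(x_test, y_test)
print(f"train MSE = {train_mse:.2f}, test MSE = {test_mse:.2f}")
# Both errors are large and similar: the signature of underfitting.
```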
Variance Explained
High Variance = Model is too complex and learns noise as if it were patterns. Small changes in the training data change the model dramatically.

Analogy: A student memorizes every example from the textbook word-for-word. When the exam uses different phrasing, the student fails. The student "overfit" to the exact training examples.

High Variance in ML:
Model perfectly fits all 100 training records (100% accuracy)
On 20 new test records: only 60% accuracy
The model memorized training noise instead of real patterns.

Low Variance: Model is stable — similar training sets produce similar models.
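The overfitting signature (near-perfect training fit, much worse test error) is easy to provoke by giving a model far more freedom than the data supports. A sketch under the same assumed quadratic target, using a deliberately over-flexible polynomial:

```python
import numpy as np

rng = np.random.default_rng(2)

def make_data(n):
    x = rng.uniform(-3, 3, n)
    return x, x**2 + 3*x + rng.normal(0.0, 1.0, n)

x_train, y_train = make_data(15)   # deliberately small training set
x_test, y_test = make_data(100)

# High-variance model: a degree-10 polynomial on only 15 points has
# enough freedom to chase the noise in each individual training point.
coeffs = np.polyfit(x_train, y_train, 10)

def mse(x, y):
    return np.mean((np.polyval(coeffs, x) - y)**2)

train_mse, test_mse = mse(x_train, y_train), mse(x_test, y_test)
print(f"train MSE = {train_mse:.3f}, test MSE = {test_mse:.3f}")
# Near-zero training error but much worse test error: overfitting.
```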
The Tradeoff Diagram
Model Complexity vs Error:

Error
  │ ●                                      ← High Bias error
  │   ●                             ●
  │     ●                         ●        ← High Variance error
  │       ●                     ●
  │          ●               ●
  │             ●  ●  ●  ●
  │                  ▲
  │       Optimal region (sweet spot)
  └────────────────────────────────────────►
   Simple                            Complex
   Model                             Model
  (High Bias)                  (High Variance)

As model complexity increases:
Bias decreases (the model becomes more flexible)
Variance increases (the model becomes more sensitive)

Goal: Find the point where total error (Bias² + Variance) is minimized.
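The U-shaped curve can be traced numerically by sweeping model complexity and measuring held-out error at each step. A sketch, again assuming NumPy and the illustrative quadratic target, using polynomial degree as the complexity axis:

```python
import numpy as np

rng = np.random.default_rng(3)

def make_data(n):
    x = rng.uniform(-3, 3, n)
    return x, x**2 + 3*x + rng.normal(0.0, 1.0, n)

x_train, y_train = make_data(40)
x_test, y_test = make_data(200)

# Sweep complexity (polynomial degree) and watch test error trace the U.
test_errors = {}
for degree in range(1, 10):
    coeffs = np.polyfit(x_train, y_train, degree)
    test_errors[degree] = np.mean((np.polyval(coeffs, x_test) - y_test)**2)

best = min(test_errors, key=test_errors.get)
for d, e in sorted(test_errors.items()):
    print(f"degree {d}: test MSE = {e:.2f}")
print(f"sweet spot: degree {best}")
```

Degree 1 sits on the high-bias side, high degrees drift toward the high-variance side, and the minimum test error marks the sweet spot.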
Identifying High Bias vs High Variance
┌─────────────────────────┬────────────────────┬─────────────────────┐
│ Symptom                 │ High Bias          │ High Variance       │
├─────────────────────────┼────────────────────┼─────────────────────┤
│ Training accuracy       │ Low                │ High (near 100%)    │
│ Test accuracy           │ Low                │ Much lower than     │
│                         │                    │ training            │
│ Gap between train/test  │ Small (both bad)   │ Large               │
│ Model name              │ Underfitting       │ Overfitting         │
│ Fix                     │ More complex model │ Simpler model /     │
│                         │ More features      │ more data /         │
│                         │                    │ regularization      │
└─────────────────────────┴────────────────────┴─────────────────────┘
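The table's symptoms can be turned into a quick diagnostic helper. A minimal sketch; the threshold values are illustrative placeholders, not universal rules:

```python
def diagnose(train_acc, test_acc, gap_threshold=0.10, low_threshold=0.70):
    """Rough bias/variance diagnosis from the symptoms in the table above.

    The thresholds are illustrative: what counts as a "large" train/test
    gap or "low" accuracy depends on the problem.
    """
    if train_acc - test_acc > gap_threshold:
        return "high variance (overfitting): simpler model, more data, or regularization"
    if train_acc < low_threshold:
        return "high bias (underfitting): more complex model or more features"
    return "looks balanced"

print(diagnose(train_acc=1.00, test_acc=0.60))  # memorized the training set
print(diagnose(train_acc=0.55, test_acc=0.53))  # too simple for the pattern
print(diagnose(train_acc=0.92, test_acc=0.89))  # healthy train/test gap
```

Checking the gap before the absolute level matters: a model with 100% training accuracy and 60% test accuracy is overfitting even though its training score looks excellent.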
