Overfitting & Regularization
Why Models Memorize and How to Make Them Generalize
What You'll Discover
Learn to spot overfitting and fix it with regularization
Spotting Overfitting
Read training vs. validation loss curves to diagnose when your model is memorizing rather than learning.
Bias-Variance Tradeoff
Understand why model complexity is a balancing act between underfitting and overfitting.
Dropout & Weight Decay
See how randomly disabling neurons and penalizing large weights prevent memorization.
Early Stopping
The simplest trick — stop training at the right moment before overfitting begins.
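Two of the techniques above can be sketched in a few lines. Below is a minimal numpy illustration (not any particular framework's API): inverted dropout, which zeroes random units and rescales the survivors, and an SGD step with L2 weight decay, where the penalty term simply adds `wd * w` to the gradient.

```python
import numpy as np

def dropout(x, p=0.5, rng=None):
    """Inverted dropout: zero each unit with probability p, then scale
    survivors by 1/(1-p) so the expected activation is unchanged."""
    rng = rng or np.random.default_rng(0)
    mask = rng.random(x.shape) >= p          # keep with probability 1-p
    return x * mask / (1.0 - p)

def sgd_step_with_weight_decay(w, grad, lr=0.1, wd=0.01):
    """One SGD step with L2 weight decay: the penalty (wd/2)*||w||^2
    contributes wd*w to the gradient, shrinking weights toward zero."""
    return w - lr * (grad + wd * w)

h = dropout(np.ones(8), p=0.5)   # each entry is exactly 0.0 or 2.0

w = np.array([2.0, -3.0])
w_new = sgd_step_with_weight_decay(w, grad=np.zeros(2))
# with a zero data gradient, decay alone pulls weights toward 0:
# w_new == [1.998, -2.997]
```

Note that dropout is only applied during training; at evaluation time all units are kept, and the 1/(1-p) rescaling during training is what keeps the two modes consistent.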
Key Concepts
Overfitting
Model memorizes training data, fails on new data
Bias-Variance Tradeoff
Balance between too simple and too complex
L2 Regularization
Penalizes large weights to keep them moderate
Dropout
Randomly disables neurons during training
Early Stopping
Stop training when validation loss rises
Generalization
The real goal — perform well on unseen data
What is Overfitting?
Imagine a student who memorizes every answer in the textbook but can't solve new problems on the exam.
That's overfitting! The model learns the training data too well — including its noise and quirks — and fails on new data it hasn't seen.
The goal isn't to memorize. It's to generalize.
The opposite problem is underfitting — like a student who barely studied and can't even answer the textbook questions.
The sweet spot? A model that learns the real patterns without memorizing the noise.
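You can watch this happen with nothing more than polynomial fitting. The sketch below (using numpy; the degrees and noise level are arbitrary choices for illustration) fits a low-degree and a high-degree polynomial to the same noisy samples of a sine curve. The high-degree model drives training error toward zero by memorizing the noise, and its test error ends up well above its training error.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

# noisy samples of a simple underlying pattern: y = sin(x) + noise
x_train = np.linspace(0, 3, 20)
y_train = np.sin(x_train) + rng.normal(0, 0.2, 20)
x_test = np.linspace(0.05, 2.95, 20)
y_test = np.sin(x_test) + rng.normal(0, 0.2, 20)

def train_test_mse(deg):
    """Fit a degree-`deg` polynomial to the training set and return
    (training MSE, test MSE)."""
    p = Polynomial.fit(x_train, y_train, deg)
    return (np.mean((p(x_train) - y_train) ** 2),
            np.mean((p(x_test) - y_test) ** 2))

train_lo, test_lo = train_test_mse(3)    # modest capacity: good fit
train_hi, test_hi = train_test_mse(15)   # excess capacity: memorizes noise
# train_hi is far below train_lo, but test_hi exceeds train_hi:
# the extra capacity was spent fitting noise, not pattern
```

The training-error drop from degree 3 to degree 15 is guaranteed (the larger model contains the smaller one), which is exactly why training loss alone can never diagnose overfitting; you need the held-out test or validation error.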
The Fitting Spectrum
| Problem | What Happens | Analogy |
|---|---|---|
| Underfitting | Too simple — misses the pattern | Student who didn't study |
| Good Fit | Captures the pattern, ignores noise | Student who understands concepts |
| Overfitting | Memorizes everything, even noise | Student who only memorized answers |
Key Signs
| Metric | Underfitting | Good Fit | Overfitting |
|---|---|---|---|
| Training Loss | High | Low | Very Low |
| Validation Loss | High | Low | High (rising) |
| Gap (val − train) | Small | Small | Large |
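The "validation loss rises while training loss keeps falling" signature is what early stopping automates. A minimal pure-Python sketch (the `patience` convention mirrors common deep learning framework callbacks, but this is a standalone illustration): track the best validation loss seen so far, and stop once it has failed to improve for `patience` consecutive epochs.

```python
def early_stop_epoch(val_losses, patience=3):
    """Return the index of the best epoch, halting the scan once
    validation loss fails to improve for `patience` epochs in a row."""
    best_loss = float("inf")
    best_epoch = 0
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break                 # overfitting has set in; stop
    return best_epoch

# validation loss falls, then rises as the model starts to memorize
history = [0.9, 0.7, 0.55, 0.5, 0.52, 0.56, 0.61, 0.7]
early_stop_epoch(history)   # → 3 (loss was lowest at epoch 3)
```

In practice you would save a checkpoint of the model weights each time the best loss improves, then restore the checkpoint from the returned epoch.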