Perceptron & Linear Classification — Lesson Content
Discover the foundation of all neural networks! Watch a perceptron learn to predict weather by drawing a line that separates rainy days from sunny days.
Meet the perceptron, the grandfather of all neural networks! In this interactive journey, you'll watch the simplest possible "brain" learn to predict weather using just humidity and temperature.
Starting with random guesses, the perceptron will make mistakes, learn from them, and gradually discover a line that separates rainy days from sunny days. Learning by correcting errors is the same core idea behind modern AI, just in its purest form.
Perfect for understanding how neural networks learn before diving into more complex architectures!
Step 1: Can We Predict Rain?
Imagine you want to predict the weather using just two measurements: **humidity** and **temperature**. You have 6 days of historical data.
Can you spot the pattern? Rainy days tend to have **high humidity** and **cooler temperatures**. Sunny days tend to have **lower humidity** and **warmer temperatures**.
**Our Dataset:**
- Day 1: 85% humidity, 18°C → **Rain**
- Day 2: 75% humidity, 16°C → **Rain**
- Day 3: 65% humidity, 22°C → **Rain**
- Day 4: 40% humidity, 30°C → **Sunny**
- Day 5: 45% humidity, 28°C → **Sunny**
- Day 6: 55% humidity, 25°C → **Sunny**
**The Goal:** Teach the simplest possible AI (a perceptron) to find this pattern automatically.
Can you see the two clusters on the scatter plot? Blue dots (rainy) tend to be in the upper-left region, yellow dots (sunny) in the lower-right.
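The six days above can be written down as a small data structure. This is a minimal sketch rather than the lesson's own code: the names `DAYS`, `normalize`, and `mean` are illustrative, the labels use 1 for Rain and 0 for Sunny, and the scaling (humidity / 100, temperature / 40) is assumed to match the `predict` function in Step 3.

```python
# Hypothetical encoding of the six training days.
# Each entry: (humidity in %, temperature in °C, label) with 1 = Rain, 0 = Sunny.
DAYS = [
    (85, 18, 1),
    (75, 16, 1),
    (65, 22, 1),
    (40, 30, 0),
    (45, 28, 0),
    (55, 25, 0),
]

def normalize(humidity, temperature):
    """Scale raw readings to comparable ranges (assumed: humidity / 100, temperature / 40)."""
    return humidity / 100, temperature / 40

def mean(values):
    return sum(values) / len(values)

rainy = [(h, t) for h, t, label in DAYS if label == 1]
sunny = [(h, t) for h, t, label in DAYS if label == 0]

# Per-class averages make the two clusters visible without a plot.
print("rain :", round(mean([h for h, _ in rainy]), 1), round(mean([t for _, t in rainy]), 1))
print("sunny:", round(mean([h for h, _ in sunny]), 1), round(mean([t for _, t in sunny]), 1))
```

The rainy days average higher humidity and lower temperature than the sunny days, which is the pattern the perceptron is asked to find.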
Step 2: Meet the Perceptron — A Simple Voting Machine
A perceptron is like a simple voting machine. It takes two inputs (humidity and temperature), assigns each a **weight** (how important that input is), adds a **bias** (a personal lean toward yes or no), and makes a single yes/no decision.
Think of weights as opinions:
- A **large positive** humidity weight means "high humidity strongly suggests rain"
- A **negative** temperature weight means "high temperature suggests sunny"
The perceptron combines these opinions into a single number. If the number is **positive or zero**, it says "Rain!" Otherwise, "Sunny!"
**Current Perceptron State:**
- Weight for Humidity (w1): 0.30
- Weight for Temperature (w2): -0.40
- Bias: -0.10
These weights are starting guesses — let's see how they do!
Step 3: Making a Prediction — Following the Numbers
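Let's trace through a single prediction using **Day 1** (85% humidity, 18°C). The inputs are first scaled to comparable ranges: humidity is divided by 100 and temperature by 40, so Day 1 becomes (0.85, 0.45).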
**Step 1 — Multiply each input by its weight:**
- Humidity contribution: 0.85 × 0.30 = 0.255
- Temperature contribution: 0.45 × (-0.40) = -0.180
**Step 2 — Add the bias:**
- Weighted sum = 0.255 + (-0.180) + (-0.10) = **-0.025**
**Step 3 — Apply the decision rule:**
- -0.025 is **negative (<0)** → Predict **☀️ Sunny**
- But Day 1 is actually **🌧️ Rain** — the perceptron is **WRONG!**
The perceptron got it wrong because its weights are random starting guesses. It hasn't learned yet.
weighted_sum = x1 × w1 + x2 × w2 + bias
             = 0.85 × 0.30 + 0.45 × (-0.40) + (-0.10)
             = 0.255 + (-0.180) + (-0.10)
             = -0.025
prediction = -0.025 < 0 → Sunny (WRONG)

def predict(humidity, temperature, weights, bias):
    x1 = humidity / 100  # normalize
    x2 = temperature / 40  # normalize
    weighted_sum = x1 * weights[0] + x2 * weights[1] + bias
    return 1 if weighted_sum >= 0 else 0  # 1=Rain, 0=Sunny
Step 4: Testing All Data — How Bad Is It?
Let's run ALL 6 data points through the perceptron with its current weights:
- **Day 1** (85%, 18°C): Actual **Rain** → Predicted **Sunny** ❌
- **Day 2** (75%, 16°C): Actual **Rain** → Predicted **Sunny** ❌
- **Day 3** (65%, 22°C): Actual **Rain** → Predicted **Sunny** ❌
- **Day 4** (40%, 30°C): Actual **Sunny** → Predicted **Sunny** ✅
- **Day 5** (45%, 28°C): Actual **Sunny** → Predicted **Sunny** ✅
- **Day 6** (55%, 25°C): Actual **Sunny** → Predicted **Sunny** ✅
**Result: 3/6 correct (50%)** — no better than flipping a coin! The perceptron needs to learn from its mistakes.
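The table above can be reproduced in a few lines. This is a sketch under the lesson's stated setup (starting weights 0.30 and -0.40, bias -0.10, humidity / 100 and temperature / 40 as inputs); the names `DAYS` and `accuracy` are illustrative.

```python
# Sketch: score the untrained perceptron on all six days.
DAYS = [(85, 18, 1), (75, 16, 1), (65, 22, 1),
        (40, 30, 0), (45, 28, 0), (55, 25, 0)]  # 1 = Rain, 0 = Sunny

def predict(humidity, temperature, weights, bias):
    x1 = humidity / 100
    x2 = temperature / 40
    weighted_sum = x1 * weights[0] + x2 * weights[1] + bias
    return 1 if weighted_sum >= 0 else 0

def accuracy(days, weights, bias):
    """Fraction of days whose prediction matches the actual label."""
    correct = sum(predict(h, t, weights, bias) == label for h, t, label in days)
    return correct / len(days)

initial_weights, initial_bias = [0.30, -0.40], -0.10
predictions = [predict(h, t, initial_weights, initial_bias) for h, t, _ in DAYS]
print(predictions)                                   # every day comes out as 0 (Sunny)
print(accuracy(DAYS, initial_weights, initial_bias))
```

Note that the 50% comes from predicting Sunny for every single day: the three sunny days are right by default and the three rainy days are all missed.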
Accuracy = correct predictions / total data points
= 3 / 6 = 50%
Step 5: Learning from Mistakes — The Update Rule
When the perceptron makes a mistake, it adjusts its weights. The rule is beautifully simple:
**If wrong:** nudge each weight toward the correct answer.
`new_weight = old_weight + learning_rate × error × input`
Let's trace through a correction using **Day 1** (the mistake from Step 3):
- **Error** = actual - predicted = 1 - 0 = **+1** (should have been Rain)
- **Learning rate** = 0.2 (how big of a step to take)
**Weight updates:**
- w1: 0.300 + 0.2 × 1 × 0.85 = 0.300 + 0.170 = **0.470** (humidity matters MORE now)
- w2: -0.400 + 0.2 × 1 × 0.45 = -0.400 + 0.090 = **-0.310** (temperature penalty reduced)
- bias: -0.100 + 0.2 × 1 = -0.100 + 0.200 = **0.100**
Watch how the decision boundary shifts on the scatter plot!
Perceptron Learning Rule:
w_new = w_old + learning_rate × error × input
b_new = b_old + learning_rate × error
Where: error = target - prediction
Applied to Day 1 (error = 1 - 0 = 1):
w1: 0.300 + 0.2 × 1 × 0.85 = 0.470
w2: -0.400 + 0.2 × 1 × 0.45 = -0.310
bias: -0.100 + 0.2 × 1 = 0.100
def learn(weights, bias, x1, x2, target, prediction, lr=0.2):
    error = target - prediction
    if error != 0:  # Only update when wrong
        weights[0] += lr * error * x1
        weights[1] += lr * error * x2
        bias += lr * error
    return weights, bias
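To check the Day 1 arithmetic, the update rule can be applied directly. The sketch below uses a hypothetical helper named `update` that returns new values instead of changing the list in place; that is a design choice for the example, not something the lesson prescribes.

```python
def update(weights, bias, inputs, target, prediction, lr=0.2):
    """Return (new_weights, new_bias) after one perceptron update.

    `inputs` are the normalized values (x1, x2). When the prediction is
    already correct, error is 0 and the weights come back unchanged.
    """
    error = target - prediction  # +1, 0, or -1
    new_weights = [w + lr * error * x for w, x in zip(weights, inputs)]
    new_bias = bias + lr * error
    return new_weights, new_bias

# Day 1: inputs (0.85, 0.45), actual Rain (1), predicted Sunny (0).
weights, bias = update([0.30, -0.40], -0.10, (0.85, 0.45), target=1, prediction=0)
print([round(w, 3) for w in weights], round(bias, 3))
```

Because the error multiplies every term, no explicit "only if wrong" check is needed in this form: a correct prediction adds zero to each weight.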
Step 6: Epoch 1 — First Full Pass
An **epoch** is one complete loop through all 6 data points. For each point, the perceptron predicts, checks if it's right, and updates weights if wrong.
**Epoch 1 Results:**
The perceptron processed all 6 points and made **2 mistakes** during the pass.
**After epoch 1:**
- Accuracy: **5/6 (83%)**
- w1 (humidity): 0.390
- w2 (temperature): -0.460
- bias: -0.100
That's an improvement from 50%! The boundary has shifted significantly from its random starting position.
But 83% isn't perfect — the edge cases near the boundary are the hardest to get right. Let's keep training.
Epoch 1 Summary:
Starting accuracy: 50% (3/6)
Ending accuracy: 83% (5/6)
Mistakes made: 2
Weight updates: 2
Weights after epoch 1:
w1 = 0.390, w2 = -0.460, bias = -0.100
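One way to see where the two mistakes happen is to trace a single pass point by point. This sketch assumes the days are visited in the listed order (Day 1 to Day 6) with a learning rate of 0.2; the lesson does not state the order explicitly, and `run_epoch` is an illustrative name.

```python
DAYS = [(85, 18, 1), (75, 16, 1), (65, 22, 1),
        (40, 30, 0), (45, 28, 0), (55, 25, 0)]  # 1 = Rain, 0 = Sunny

def run_epoch(days, weights, bias, lr=0.2):
    """One pass over the data, updating after every wrong prediction."""
    weights = list(weights)  # work on a copy so the caller's list is untouched
    mistakes = 0
    for day, (humidity, temperature, target) in enumerate(days, start=1):
        x1, x2 = humidity / 100, temperature / 40
        weighted_sum = x1 * weights[0] + x2 * weights[1] + bias
        prediction = 1 if weighted_sum >= 0 else 0
        error = target - prediction
        if error != 0:
            mistakes += 1
            weights[0] += lr * error * x1
            weights[1] += lr * error * x2
            bias += lr * error
        print(f"Day {day}: sum={weighted_sum:+.4f} error={error:+d}")
    return weights, bias, mistakes

weights, bias, mistakes = run_epoch(DAYS, [0.30, -0.40], -0.10)
print(mistakes, [round(w, 3) for w in weights], round(bias, 3))
```

Under these assumptions the mistakes fall on Day 1 (a missed rainy day) and Day 4 (a sunny day called rainy after the first correction overshoots), and the pass ends at the weights listed above.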
Step 7: Epochs 2-11 — Getting Closer
Training continues. Each epoch, the perceptron processes all 6 points and adjusts weights when it makes mistakes.
**The boundary keeps adjusting:**
The perceptron is working on the hardest points — the ones closest to the decision boundary. Each epoch, the weights shift slightly in the right direction.
**Epoch 6:**
- Mistakes during the pass: 2
- Accuracy after the pass: 83% (5/6)
- Weights: [0.640, -0.660], bias: -0.100
**Epoch 11:**
- Mistakes during the pass: 2
- Accuracy after the pass: 100% (6/6)
- Weights: [0.840, -0.810], bias: -0.100
The edge cases near the boundary are the trickiest. Each epoch brings the weights closer to the solution.
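The epoch-by-epoch progress can be logged with a short training loop. This is a sketch, not the lesson's actual implementation: it assumes the days are visited in listed order, a learning rate of 0.2, and accuracy measured after each pass finishes. The names `train` and `history` are illustrative.

```python
DAYS = [(85, 18, 1), (75, 16, 1), (65, 22, 1),
        (40, 30, 0), (45, 28, 0), (55, 25, 0)]  # 1 = Rain, 0 = Sunny

def predict(x1, x2, weights, bias):
    return 1 if x1 * weights[0] + x2 * weights[1] + bias >= 0 else 0

def train(days, weights, bias, lr=0.2, max_epochs=50):
    """Repeat passes until one finishes with no mistakes (or max_epochs is hit)."""
    points = [(h / 100, t / 40, label) for h, t, label in days]
    history = []  # one (mistakes, accuracy_after_pass, weights, bias) entry per epoch
    for _ in range(max_epochs):
        mistakes = 0
        for x1, x2, target in points:
            error = target - predict(x1, x2, weights, bias)
            if error != 0:
                mistakes += 1
                # Build a new list so earlier history entries are not modified.
                weights = [weights[0] + lr * error * x1,
                           weights[1] + lr * error * x2]
                bias += lr * error
        correct = sum(predict(x1, x2, weights, bias) == label
                      for x1, x2, label in points)
        history.append((mistakes, correct / len(points), weights, bias))
        if mistakes == 0:
            break
    return history

history = train(DAYS, [0.30, -0.40], -0.10)
for epoch, (mistakes, acc, w, b) in enumerate(history, start=1):
    print(epoch, mistakes, f"{acc:.0%}", [round(v, 3) for v in w], round(b, 3))
```

The log shows why a pass can contain mistakes and still end at 100%: the mistake count is taken during the pass, while accuracy is measured with the weights left at the end of it.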
Step 8: The Perceptron Has Learned!
**Convergence** means the perceptron has found weights that correctly classify ALL training data, and the weights stop changing.
**Final trained perceptron:**
- w1 (humidity): **0.840** — positive! High humidity predicts rain
- w2 (temperature): **-0.810** — negative! High temperature predicts sunny
- bias: **-0.100**
**What the weights mean:**
The perceptron discovered that humidity is a strong signal for rain (large positive weight), while high temperature signals against rain (negative weight). The bias fine-tunes where the dividing line sits.
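**Converged in 12 epochs.** The weights reached their final values at the end of epoch 11; epoch 12 was the first full pass with no mistakes, so nothing changed.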
The decision boundary now perfectly separates all rainy days from sunny days. The **Perceptron Convergence Theorem** guarantees this will always happen for linearly separable data!
Convergence Theorem:
If the data is linearly separable, the perceptron
learning algorithm is GUARANTEED to converge
in a finite number of steps.
Our perceptron converged in 12 epochs.
Final decision boundary:
0.840 × (humidity / 100) + (-0.810) × (temperature / 40) + (-0.100) = 0
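The boundary equation can be turned back into everyday units by solving it for temperature. The sketch below assumes the inputs are humidity / 100 and temperature / 40, as in the `predict` function from Step 3; `boundary_temperature` is a hypothetical helper name.

```python
W1, W2, BIAS = 0.840, -0.810, -0.100  # final weights from the lesson

def boundary_temperature(humidity):
    """Temperature (°C) at which the weighted sum is exactly zero for a given humidity (%)."""
    x1 = humidity / 100
    x2 = -(W1 * x1 + BIAS) / W2  # solve W1*x1 + W2*x2 + BIAS = 0 for x2
    return x2 * 40               # undo the temperature scaling

for humidity in (40, 65, 85):
    print(humidity, round(boundary_temperature(humidity), 1))
```

Because the temperature weight is negative, days cooler than the boundary temperature land on the rain side. At 65% humidity the boundary sits just above 22°C, so Day 3 (65%, 22°C) is classified as Rain by a very small margin.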
Step 9: Prediction Time — Using the Trained Model
The real test of any AI: can it handle NEW data it hasn't seen before?
**New Day A: 80% humidity, 20°C**
- Normalized: (0.80, 0.50)
- Weighted sum = 0.80 × 0.840 + 0.50 × (-0.810) + (-0.100) = **0.167**
- Prediction: Positive → **🌧️ Rain**
**New Day B: 35% humidity, 32°C**
- Normalized: (0.35, 0.80)
- Weighted sum = 0.35 × 0.840 + 0.80 × (-0.810) + (-0.100) = **-0.454**
- Prediction: Negative → **☀️ Sunny**
The perceptron generalizes! It learned a RULE, not just memorized the training data. Any new point on the "rain side" of the boundary will be predicted as rain, and vice versa.
New Day A: (0.80, 0.50)
sum = 0.80 × 0.840 + 0.50 × (-0.810) + (-0.100) = 0.167 ≥ 0 → Rain
New Day B: (0.35, 0.80)
sum = 0.35 × 0.840 + 0.80 × (-0.810) + (-0.100) = -0.454 < 0 → Sunny
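The two predictions above can be checked with the trained weights. This is a small sketch using the lesson's final values; the function name `weighted_sum` and the `NEW_DAYS` list are illustrative.

```python
WEIGHTS, BIAS = (0.840, -0.810), -0.100  # final weights from the lesson

def weighted_sum(humidity, temperature):
    """Weighted sum for raw readings, using humidity / 100 and temperature / 40."""
    return (humidity / 100) * WEIGHTS[0] + (temperature / 40) * WEIGHTS[1] + BIAS

NEW_DAYS = [("A", 80, 20), ("B", 35, 32)]
results = {}
for name, humidity, temperature in NEW_DAYS:
    total = weighted_sum(humidity, temperature)
    results[name] = "Rain" if total >= 0 else "Sunny"
    print(name, round(total, 3), results[name])
```

The size of the sum also hints at how far a point sits from the boundary: Day B is well inside the sunny side, while Day A is closer to the line.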
Step 10: Beyond the Straight Line
The perceptron has one fundamental limitation: **it can only draw straight lines.** This works perfectly for our weather data, but what about patterns that aren't linearly separable?
**The XOR Problem (a famous AI limitation):**
Imagine data where the pattern forms a checkerboard — no single straight line can separate the classes. The perceptron would fail here and never converge.
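The failure is easy to observe on the four XOR points. The sketch below reuses the same update rule with assumed starting values (zero weights, zero bias, learning rate 0.2) and simply counts mistakes per epoch; the names `XOR` and `train` are illustrative.

```python
# XOR: the label is 1 when exactly one of the two inputs is 1.
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def train(points, epochs, lr=0.2):
    weights, bias = [0.0, 0.0], 0.0
    mistakes_per_epoch = []
    for _ in range(epochs):
        mistakes = 0
        for (x1, x2), target in points:
            prediction = 1 if x1 * weights[0] + x2 * weights[1] + bias >= 0 else 0
            error = target - prediction
            if error != 0:
                mistakes += 1
                weights[0] += lr * error * x1
                weights[1] += lr * error * x2
                bias += lr * error
        mistakes_per_epoch.append(mistakes)
    return weights, bias, mistakes_per_epoch

weights, bias, mistakes_per_epoch = train(XOR, epochs=100)
print(min(mistakes_per_epoch))  # never reaches 0
```

No pass is ever mistake-free, however many epochs are run, because no single line puts (0, 1) and (1, 0) on one side and (0, 0) and (1, 1) on the other.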
**The Solution: Neural Networks!**
Stack multiple perceptrons together in layers, and suddenly you can learn CURVED boundaries and complex patterns. Each neuron in a neural network is essentially a perceptron — the same building block you just learned.
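As a small illustration of stacking, three perceptrons arranged in two layers can compute XOR. The weights below are chosen by hand for the example, not learned, and the function names are hypothetical.

```python
def step(z):
    return 1 if z >= 0 else 0

def perceptron(inputs, weights, bias):
    return step(sum(x * w for x, w in zip(inputs, weights)) + bias)

def xor_network(x1, x2):
    # Hidden layer: two perceptrons looking at the same inputs.
    h_or = perceptron((x1, x2), (1.0, 1.0), -0.5)   # fires if at least one input is 1
    h_and = perceptron((x1, x2), (1.0, 1.0), -1.5)  # fires only if both inputs are 1
    # Output layer: "OR but not AND" is exactly XOR.
    return perceptron((h_or, h_and), (1.0, -1.0), -0.5)

outputs = [xor_network(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)  # [0, 1, 1, 0]
```

Each hidden perceptron still draws one straight line; combining their outputs is what lets the network carve out a region that no single line can.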
**What you learned today:**
1. Binary classification separates data into two groups
2. A perceptron is the simplest neural network — one neuron
3. It learns by adjusting weights when it makes mistakes
4. Training = multiple epochs through the data
5. The convergence theorem guarantees it works (for linearly separable data)
6. Neural networks = many perceptrons working together
**Real-world applications:**
- Spam detection (word features → spam/not spam)
- Medical screening (symptoms → likely/unlikely condition)
- Credit scoring (financial features → approve/deny)
Step 11: Test Your Understanding
You've learned how a perceptron works from the ground up — from raw data to a trained model. Let's see how well you understood the key concepts!
Prerequisites
- Basic understanding of graphs and coordinates
- No advanced math required!
Key Concepts
- Linear Classification
- Decision Boundaries
- Perceptron Learning Rule
- Weight Updates
- Binary Classification