Neural Network Forward Pass

Watch Data Flow Through a Neural Network

Difficulty: Beginner
Duration: 15-20 minutes
Prerequisites: Basic functions

What You'll Discover

Follow a real-world example: predicting if a student will pass their test

Real-World Prediction

See how a neural network uses study hours, sleep, and past scores to predict if a student will pass or fail.

Network Architecture

Understand input, hidden, and output layers - and how neurons connect to form a decision-making system.

Input Normalization

Learn why raw data is scaled to a 0-1 range before it is fed into a neural network.

Pattern Detection

Watch hidden neurons compute weighted sums and detect patterns like "well-prepared student" or "needs more study."

Activation Functions

See how ReLU activation creates thresholds - neurons only "fire" when they detect strong enough patterns.

Making Predictions

Follow the final output layer as it combines pattern detections into a pass/fail prediction with confidence scores.

Key Concepts Covered

Forward Propagation

Data flows one direction through the network, from input to output, layer by layer

Weighted Sums

Each neuron multiplies inputs by learned weights and sums them up to detect patterns

Activation Functions

Non-linear functions like ReLU that let networks learn complex, non-linear patterns

Neural Network Layers

Input, hidden, and output layers each serve a distinct role in processing information

Input Normalization

Scaling raw data to a 0-1 range so that features measured on large scales do not dominate those measured on small ones

Prediction & Confidence

The output layer produces predictions with confidence scores based on learned patterns

Why This Matters

The forward pass is the fundamental operation of every neural network. Understanding it unlocks the rest of deep learning.

Foundation of AI

Every neural network prediction is produced by a forward pass

Image Recognition

Same principle powers computer vision systems

Language Models

Language models like ChatGPT run a forward pass for every token they generate

Self-Driving Cars

Sensor data flows through networks to make driving decisions

Alex's Data

Feature         Raw Value   Description
Hours Studied   7           Study time today (0-10 hrs)
Hours Slept     8           Sleep last night (0-10 hrs)
Previous Score  85%         Last test result (0-100%)

Neural Network Forward Pass — Lesson Content

Follow a real-world example: predicting if a student will pass their test. Watch how data flows through layers of neurons step by step.

Trace the complete forward pass of a neural network predicting whether a student named Alex will pass tomorrow's exam. Using Alex's study time, sleep, and previous score, you'll see how data flows from the input layer through hidden neurons to the final prediction. Each step breaks down the math — weighted sums, activation functions, and the final decision — so you understand exactly what happens inside a neural network. This is the same fundamental process used by all neural networks, from image classifiers to language models.

Learning Objectives

  • Understand how data flows through a neural network layer by layer
  • Learn what normalization is and why inputs need scaling
  • See how weighted sums combine inputs with learned importance
  • Understand the role of ReLU activation in adding non-linearity
  • Follow a complete forward pass from raw data to prediction

Step 1: The Problem: Will This Student Pass?

Meet Alex, a student preparing for tomorrow's exam. We have three pieces of information:

  • Hours Studied Today: 7 hours
  • Hours Slept Last Night: 8 hours
  • Previous Test Score: 85%

Can we predict if Alex will pass or fail tomorrow's test? We'll use a neural network that has been trained on data from thousands of students.

Let's walk through exactly how the network processes Alex's data, step by step.

Step 2: Network Architecture: Three Layers

Our neural network has three layers:

  • Input Layer (3 neurons): Receives Alex's data, one neuron per feature
  • Hidden Layer (4 neurons): Detects patterns and relationships between features
  • Output Layer (2 neurons): Produces a "Pass" score and a "Fail" score

The lines connecting neurons are weights, which determine how much influence each input has. Green lines are positive (excitatory), red lines are negative (inhibitory), and thicker lines indicate stronger connections. This network learned these weights by training on thousands of past student records.

Step 3: Feeding in Alex's Data

Neural networks work best when every input is on a small, comparable scale, so we normalize each feature to the 0-1 range by dividing it by its maximum possible value:

  • Hours Studied: 7 / 10 = 0.70
  • Hours Slept: 8 / 10 = 0.80
  • Previous Score: 85 / 100 = 0.85

Each input neuron now holds one normalized value. Because all features share the same scale, a feature with a large raw range (such as a 0-100 score) cannot dominate the others simply because of its units.

Normalization:
  x1 = 7 / 10  = 0.70
  x2 = 8 / 10  = 0.80
  x3 = 85 / 100 = 0.85
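
# Alex's raw feature values
hours_studied = 7
hours_slept = 8
previous_score = 85

# Normalize inputs to the 0-1 range by dividing by each feature's maximum
inputs = [
    hours_studied / 10,    # 0.70
    hours_slept / 10,      # 0.80
    previous_score / 100   # 0.85
]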
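
Dividing by the maximum works here because every feature's range starts at 0. For ranges that start elsewhere, the usual generalization is min-max scaling. The sketch below is illustrative only: the helper name `min_max_normalize` is hypothetical, and the ranges are the ones listed in the lesson's data table.

```python
def min_max_normalize(value, low, high):
    """Scale a value from the range [low, high] into [0, 1]."""
    if high == low:
        raise ValueError("low and high must differ")
    return (value - low) / (high - low)

# Feature ranges taken from the lesson: 0-10 hours, 0-10 hours, 0-100 percent.
alex_inputs = [
    min_max_normalize(7, 0, 10),    # hours studied
    min_max_normalize(8, 0, 10),    # hours slept
    min_max_normalize(85, 0, 100),  # previous score
]
print(alex_inputs)
```

With a lower bound of 0 this reduces to the plain division used above.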

Step 4: Hidden Layer Weighted Sums

Each hidden neuron computes a weighted sum of all inputs plus a bias. Think of each neuron as a "pattern detector": it amplifies inputs it considers important and diminishes ones it doesn't.

Hidden Neuron 1 (example), with weights 0.8, 0.4, 0.7 and bias 0.1:

  • (0.70 x 0.8) + (0.80 x 0.4) + (0.85 x 0.7) + 0.1 = 1.575

The other three hidden neurons compute the same kind of weighted sum with their own weights and biases, each looking for a different pattern in Alex's data.

For each hidden neuron j:
  z_j = sum(input_i * weight_ij) + bias_j

Neuron 1: (0.70 x 0.8) + (0.80 x 0.4) + (0.85 x 0.7) + 0.1 = 1.575
Neuron 2: 0.885
Neuron 3: 0.525
Neuron 4: 0.990
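
# Hidden layer: weighted sum of the inputs plus a bias, one value per neuron.
# W_input_hidden and bias_hidden hold the trained weights and biases.
hidden_raw = [0.0] * hidden_size
for j in range(hidden_size):
    z = bias_hidden[j]
    for i in range(input_size):
        z += inputs[i] * W_input_hidden[i][j]
    hidden_raw[j] = z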
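
To check the arithmetic for Hidden Neuron 1, the sketch below recomputes its weighted sum using the weights (0.8, 0.4, 0.7) and bias (0.1) quoted in the worked example. The helper name `weighted_sum` is hypothetical; the weights of neurons 2-4 are not given in the lesson, so they are not reproduced here.

```python
def weighted_sum(values, weights, bias):
    """Return sum(value_i * weight_i) + bias for one neuron."""
    return sum(v * w for v, w in zip(values, weights)) + bias

inputs = [0.70, 0.80, 0.85]          # Alex's normalized features
weights_neuron_1 = [0.8, 0.4, 0.7]   # from the worked example
bias_neuron_1 = 0.1

z1 = weighted_sum(inputs, weights_neuron_1, bias_neuron_1)
print(round(z1, 3))  # 1.575
```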

Step 5: ReLU Activation

Raw weighted sums are passed through the ReLU activation function. ReLU is simple: keep positive values, zero out negatives. This non-linearity is what lets neural networks learn complex, curved decision boundaries instead of just straight lines.

After activation:

  • Neuron 1: 1.575 -> 1.575 (active!)
  • Neuron 2: 0.885 -> 0.885 (active!)
  • Neuron 3: 0.525 -> 0.525 (active!)
  • Neuron 4: 0.990 -> 0.990 (active!)

Neurons that "fire" (output > 0) detected a strong-enough pattern in Alex's data. Neurons that output 0 didn't find their pattern. For Alex, all four weighted sums are positive, so ReLU passes them through unchanged.

ReLU(z) = max(0, z)

ReLU(1.575) = 1.575
ReLU(0.885) = 0.885
ReLU(0.525) = 0.525
ReLU(0.990) = 0.990
def relu(z):
    return max(0, z)

# Apply activation to hidden layer
hidden_activated = [relu(z) for z in hidden_raw]
# [1.575, 0.885, 0.525, 0.99]
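
Because all of Alex's weighted sums are positive, the example above never shows ReLU zeroing anything out. The short sketch below illustrates that case. The values -0.4 and 0.0 are made up for illustration and do not come from the lesson's network.

```python
def relu(z):
    """Keep positive values, replace everything else with 0."""
    return max(0.0, z)

# First two values are from the lesson; the last two are hypothetical.
raw_sums = [1.575, 0.525, -0.4, 0.0]
activated = [relu(z) for z in raw_sums]
print(activated)  # [1.575, 0.525, 0.0, 0.0]
```

A neuron with a negative weighted sum contributes nothing to the next layer, which is the "did not fire" case described above.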

Step 6: Output Layer Computation

The hidden layer activations now flow to the output layer. We have two output neurons:

  • "Pass" neuron: Aggregates evidence that Alex will pass
  • "Fail" neuron: Aggregates evidence that Alex will fail

Each output neuron computes its own weighted sum from the hidden activations. Notice the weight pattern: the "Pass" neuron has positive weights (high hidden activations increase pass confidence) while the "Fail" neuron has negative weights (mirror image).

Raw output scores:

  • Pass neuron: 3.444
  • Fail neuron: -3.444

Pass = sum(hidden_j * w_j_pass) + bias_pass
     = (1.575 x 0.90) + (0.885 x 0.70) + (0.525 x 0.60) + (0.990 x 0.80) + 0.3
     = 3.444

Fail = sum(hidden_j * w_j_fail) + bias_fail
     = -3.444
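
# Compute output layer: weighted sum of hidden activations plus a bias.
# W_hidden_output and bias_output hold the trained weights and biases.
output_raw = [0.0] * output_size
for k in range(output_size):
    z = bias_output[k]
    for j in range(hidden_size):
        z += hidden_activated[j] * W_hidden_output[j][k]
    output_raw[k] = z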

# pass_score = 3.444
# fail_score = -3.444
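
The output scores can be reproduced from the numbers quoted in this step. In the sketch below, the Pass weights (0.90, 0.70, 0.60, 0.80) and bias (0.3) come from the worked equation. The Fail weights and bias are an assumption: the lesson only says they are a "mirror image", so they are taken here to be the exact negatives, which does yield the stated -3.444.

```python
def weighted_sum(values, weights, bias):
    """Return sum(value_i * weight_i) + bias for one neuron."""
    return sum(v * w for v, w in zip(values, weights)) + bias

hidden_activated = [1.575, 0.885, 0.525, 0.990]

w_pass = [0.90, 0.70, 0.60, 0.80]   # from the worked equation
bias_pass = 0.3

# Assumption: "mirror image" means negated weights and negated bias.
w_fail = [-w for w in w_pass]
bias_fail = -bias_pass

pass_raw = weighted_sum(hidden_activated, w_pass, bias_pass)
fail_raw = weighted_sum(hidden_activated, w_fail, bias_fail)
print(round(pass_raw, 3), round(fail_raw, 3))  # 3.444 -3.444
```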

Step 7: The Prediction: Alex Will PASS!

After applying ReLU to the output layer, we get the final scores:

  • Pass: 3.444
  • Fail: 0.000

Prediction: PASS, because the Pass score is the larger of the two. These values are unbounded scores rather than probabilities, so they cannot be read as percentages. A confidence between 0% and 100% requires normalizing the raw output scores, typically with softmax.

The network predicts success because:

  • 7 hours of study is substantial
  • 8 hours of sleep ensures sharpness
  • 85% previous score shows a strong foundation
  • These patterns match thousands of successful students the network trained on

Final scores (after ReLU):
  Pass score: 3.444
  Fail score: 0.000

Prediction: argmax(pass, fail) = PASS

# Apply activation and decide
pass_score = relu(output_raw[0])  # 3.444
fail_score = relu(output_raw[1])  # 0.000

if pass_score > fail_score:
    prediction = "PASS"
    score = pass_score
else:
    prediction = "FAIL"
    score = fail_score

# ReLU outputs are unbounded scores, not probabilities,
# so report the raw score instead of formatting it as a percentage.
print(f"Prediction: {prediction}")
print(f"Score: {score:.3f}")
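
A ReLU output has no upper bound, so a value such as 3.444 is a score and not a probability. One common way to turn the two raw output scores into confidences that sum to 100% is softmax. This is a swapped-in technique shown for comparison, not what the lesson's code does, and the function below is a minimal sketch.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Raw output scores from the lesson, before any activation.
pass_prob, fail_prob = softmax([3.444, -3.444])
print(f"Pass: {pass_prob:.1%}, Fail: {fail_prob:.1%}")  # Pass: 99.9%, Fail: 0.1%
```

The prediction is unchanged (PASS), but the confidence now falls between 0% and 100%.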

Step 8: How Neural Networks Learn

You just watched a complete forward pass! Here's the big picture:

What we did (forward pass):

  1. Normalized raw data into 0-1 range
  2. Multiplied inputs by learned weights and summed them
  3. Applied ReLU to introduce non-linearity
  4. Repeated for the output layer
  5. Picked the winner as the prediction

How the network learned these weights (training):

  1. Start with random weights
  2. Feed in a student's data and predict pass/fail
  3. Compare prediction to the actual outcome
  4. Use backpropagation to calculate how each weight contributed to the error
  5. Adjust weights using gradient descent to reduce the error
  6. Repeat with thousands of students until the weights converge

This same process powers:

  • Image recognition (pixels -> object labels)
  • Language models like ChatGPT (text -> next word)
  • Self-driving cars (sensors -> driving decisions)

The architecture you explored is the foundation of modern AI. More complex networks use the same principles with millions of neurons and billions of weights.
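
The training loop described above can be sketched on the smallest possible network: a single sigmoid neuron with one input. Everything in this example is hypothetical (the toy data, the learning rate, the number of steps, and the starting weights of zero rather than random values), and with only one neuron the "backpropagation" step reduces to a single gradient formula. It is meant to show the predict, compare, adjust cycle, not the lesson's actual training procedure.

```python
import math

# Hypothetical toy data: (normalized study hours, 1.0 = passed / 0.0 = failed)
data = [(0.2, 0.0), (0.4, 0.0), (0.7, 1.0), (0.9, 1.0)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mean_loss(w, b):
    """Average binary cross-entropy over the toy data."""
    total = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)

w, b = 0.0, 0.0          # starting weights (zeros for reproducibility)
learning_rate = 0.5
loss_before = mean_loss(w, b)

for _ in range(200):
    grad_w = grad_b = 0.0
    for x, y in data:
        error = sigmoid(w * x + b) - y   # prediction minus actual outcome
        grad_w += error * x              # how much w contributed to the error
        grad_b += error
    # Gradient descent: move each parameter against its average gradient.
    w -= learning_rate * grad_w / len(data)
    b -= learning_rate * grad_b / len(data)

loss_after = mean_loss(w, b)
print(f"loss before: {loss_before:.3f}, loss after: {loss_after:.3f}")
```

The loss after training is lower than before, and the learned weight is positive, meaning more study hours push the prediction toward "pass".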

Prerequisites

  • Basic understanding of functions
  • No advanced math required!

Key Concepts

  • Forward Propagation
  • Neurons & Weights
  • Activation Functions
  • Pattern Recognition
  • Making Predictions