Recurrent Neural Networks

Networks That Remember: Processing Sequences Over Time

Difficulty
Intermediate
Duration
18-22 minutes
Prerequisites
Neural networks, Backpropagation

What You'll Discover

Understand how RNNs process sequential data with memory

Hidden State Memory

See how RNNs maintain context across time steps by passing hidden states forward.

Vanishing Gradients

Understand why vanilla RNNs struggle with long sequences and how gradients decay.

LSTM & GRU Gates

Learn how gating mechanisms control what to remember, forget, and output.

Real-World Applications

Match RNN architectures to tasks from sentiment analysis to machine translation.

Key Concepts

Sequential Data

Data where order matters: text, time series, audio, video

Hidden States

The RNN's memory that carries context between time steps

Vanishing Gradients

Gradients shrink exponentially, limiting long-range learning

LSTM

Long Short-Term Memory with forget, input, and output gates

GRU

Simplified gating with update and reset gates

Sequence-to-Sequence

Encoder-decoder architecture for translation and summarization

Step
1 / 8

Why Sequential Data Needs Special Networks

Many real-world data types have a natural order that matters:

  • Text: "Dog bites man" vs. "Man bites dog" — same words, opposite meaning
  • Time series: Stock prices, weather data, sensor readings
  • Audio: Speech is a sequence of sounds over time
  • Video: A sequence of image frames

A regular feedforward network treats all inputs independently — it has no concept of order. If you feed it the words "the cat sat" as three separate inputs, it doesn't know which came first, second, or third.

Recurrent Neural Networks (RNNs) solve this by processing inputs one at a time in order, maintaining a hidden state that carries information from previous steps. Think of it like reading a sentence: your understanding of each word is shaped by the words that came before it.

The key idea: instead of processing the entire sequence at once, process it step by step, building up context as you go.
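This step-by-step update can be sketched in a few lines of NumPy. The core operation is combining the current input with the previous hidden state: h_t = tanh(W_xh · x_t + W_hh · h_prev + b). The sizes and random weights below are illustrative toy values, not a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

input_size, hidden_size = 4, 3
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden -> hidden (the "memory" path)
b = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    """One RNN step: mix the current input with the previous hidden state."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b)

# Process a 3-step sequence one input at a time, carrying the hidden state forward.
h = np.zeros(hidden_size)                    # empty memory before the sequence starts
for x_t in rng.normal(size=(3, input_size)):
    h = rnn_step(x_t, h)                     # h now summarizes everything seen so far

print(h.shape)  # (3,)
```

Note that the same two weight matrices are reused at every time step; only the hidden state changes as the sequence unfolds.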

Sequential Data in the Real World

Data Type   | Example                    | Why Order Matters              | Typical Task
Text        | "I love this movie"        | Word order determines meaning  | Sentiment analysis
Time Series | Stock: 100, 105, 103, 110  | Trends depend on ordering      | Price prediction
Audio       | Speech waveform            | Sounds must be in sequence     | Speech recognition
DNA         | ATCGATCG...                | Gene sequences encode proteins | Protein structure prediction
Music       | Notes over time            | Melody = notes in order        | Music generation
Video       | Frame 1, Frame 2, ...      | Actions unfold over time       | Activity recognition

Feedforward vs Recurrent Networks

Aspect           | Feedforward Network            | Recurrent Network
Input processing | All at once (parallel)         | One step at a time (sequential)
Memory           | None — each input independent  | Hidden state carries context forward
Word order       | "cat sat the" = "the cat sat"  | "the cat sat" ≠ "cat sat the"
Variable length  | Fixed input size only          | Handles any sequence length
Best for         | Images, tabular data           | Text, time series, audio
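The variable-length row deserves a concrete illustration: because an RNN applies the same weights at every step, one set of parameters can summarize a 2-step or a 10-step sequence into the same fixed-size vector. A feedforward layer, by contrast, is tied to one input size. The sketch below uses toy dimensions and random weights purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
input_size, hidden_size = 4, 3
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))

def encode(sequence):
    """Run the RNN over a whole sequence and return the final hidden state."""
    h = np.zeros(hidden_size)
    for x_t in sequence:                       # works for any number of time steps
        h = np.tanh(W_xh @ x_t + W_hh @ h)
    return h

short_seq = rng.normal(size=(2, input_size))   # 2 time steps
long_seq = rng.normal(size=(10, input_size))   # 10 time steps

# Both sequences yield a fixed-size summary vector, regardless of length.
print(encode(short_seq).shape, encode(long_seq).shape)  # (3,) (3,)
```

This final hidden state is what a downstream classifier (e.g. for sentiment analysis) would consume, which is why RNNs pair naturally with the tasks in the table above.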