Recurrent Neural Networks (RNN)

  • This lesson introduces Recurrent Neural Networks and explains how they process sequential data such as text and time series.
  • Recurrent Neural Networks (RNN)

    An RNN is a type of neural network designed for sequential data — data where order matters.

    Unlike CNNs (designed for images), RNNs work best for:

    • Text

    • Time series

    • Speech

    • Sequential patterns


    Sequential Data

    Sequential data = data where previous information affects next output.

    Examples:

    • Sentence: word order matters

    • Stock prices: past prices affect future prices

    • Speech: previous sounds affect the next

    • Temperature: values are time-dependent

    Example sentence:

    "I am learning Deep Learning"

    If we shuffle the words, the meaning changes.

    So order is important.


    RNN Architecture

    In a normal ANN:

    Input → Hidden → Output

    In RNN:

    Input₁ → Hidden₁ → Output₁

              ↓

    Input₂ → Hidden₂ → Output₂

              ↓

    Input₃ → Hidden₃ → Output₃

    The key idea:

    Hidden state from previous step is passed to next step.

    This gives RNN memory.


    Hidden State (Memory)

    The hidden state (hₜ) stores information carried over from previous time steps.

    Mathematical Form:

    hₜ = f(W · xₜ + U · hₜ₋₁)

    Where:

    • xₜ = current input

    • hₜ₋₁ = previous hidden state

    • W, U = weight matrices

    • f = activation function (usually tanh)
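    The update rule above can be sketched in a few lines of NumPy. The dimensions (3 inputs, 4 hidden units) are chosen only for illustration, and the bias term is omitted to match the formula as written:

    ```python
    import numpy as np

    np.random.seed(0)

    input_size, hidden_size = 3, 4  # illustrative sizes

    # W maps the current input to the hidden space,
    # U maps the previous hidden state to the hidden space.
    W = np.random.randn(hidden_size, input_size) * 0.1
    U = np.random.randn(hidden_size, hidden_size) * 0.1

    def rnn_step(x_t, h_prev):
        """One RNN step: h_t = tanh(W x_t + U h_{t-1})."""
        return np.tanh(W @ x_t + U @ h_prev)

    x_t = np.random.randn(input_size)
    h_prev = np.zeros(hidden_size)   # initial hidden state h0
    h_t = rnn_step(x_t, h_prev)
    print(h_t.shape)  # (4,)
    ```

    Because tanh squashes its input, every component of hₜ stays in (−1, 1), which keeps the hidden state numerically stable from step to step.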

    Simple Intuition

    If sentence is:

    "The movie was not good"

    When the RNN reads "not", it remembers it while processing "good".

    Hidden state carries this context.


    Unrolling RNN

    An RNN is the same network applied repeatedly across time steps.

    Example (3 time steps):

    x1 → [RNN] → h1

    x2 → [RNN] → h2

    x3 → [RNN] → h3

    Weights are shared at every time step.
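    Unrolling is just a loop that reuses the same weights at every time step. A minimal NumPy sketch (sizes again chosen for illustration):

    ```python
    import numpy as np

    np.random.seed(0)

    input_size, hidden_size, T = 3, 4, 3  # 3 time steps

    # The SAME W and U are reused at every time step (weight sharing).
    W = np.random.randn(hidden_size, input_size) * 0.1
    U = np.random.randn(hidden_size, hidden_size) * 0.1

    xs = [np.random.randn(input_size) for _ in range(T)]  # x1, x2, x3
    h = np.zeros(hidden_size)                             # h0

    hs = []
    for x_t in xs:                     # the unrolled loop
        h = np.tanh(W @ x_t + U @ h)   # h_t depends on x_t and h_{t-1}
        hs.append(h)

    print(len(hs))  # 3
    ```

    Note that nothing new is learned per time step: the loop body is identical each iteration, which is exactly what the unrolled diagram depicts.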


    Vanishing Gradient in RNN

    During training with Backpropagation Through Time (BPTT), the gradient from a late time step is multiplied by one factor for every earlier step it passes through.

    If each factor is less than 1:
    The product keeps shrinking → becomes almost zero.

    This is called:

    Vanishing Gradient Problem

    Result:

    • Model forgets long-term dependencies

    • Cannot learn long sequences

    Example:
    In a long sentence:

    "The movie which I watched last year in London was not good"

    By the time the model finishes the sentence, the word "not" may be forgotten.
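    The effect is easy to see numerically. If each backward step scales the gradient by a factor below 1 (the numbers here are illustrative), the product over many time steps collapses toward zero:

    ```python
    # Each time step multiplies the gradient by some factor.
    # With a per-step factor of 0.9, the signal vanishes quickly.
    factor = 0.9
    for steps in (10, 50, 100):
        print(steps, factor ** steps)
    ```

    After 100 steps the surviving gradient is on the order of 10⁻⁵, far too small to update weights meaningfully — which is why plain RNNs struggle with long sequences. (Factors above 1 cause the mirror problem, exploding gradients.)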

    Solution:

    Advanced RNN variants:

    • LSTM (Long Short-Term Memory)

    • GRU (Gated Recurrent Unit)

    Their gating mechanisms largely mitigate the vanishing gradient problem by controlling what the network keeps and what it discards.
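    As a rough illustration of gating, here is a minimal sketch of one GRU step in NumPy. This follows one common parameterization (biases omitted for brevity); real implementations in deep learning frameworks add biases and batch dimensions:

    ```python
    import numpy as np

    np.random.seed(0)

    input_size, hidden_size = 3, 4  # illustrative sizes

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # One (W, U) weight pair per gate.
    Wz, Uz = np.random.randn(hidden_size, input_size), np.random.randn(hidden_size, hidden_size)
    Wr, Ur = np.random.randn(hidden_size, input_size), np.random.randn(hidden_size, hidden_size)
    Wh, Uh = np.random.randn(hidden_size, input_size), np.random.randn(hidden_size, hidden_size)

    def gru_step(x_t, h_prev):
        z = sigmoid(Wz @ x_t + Uz @ h_prev)              # update gate: how much to refresh
        r = sigmoid(Wr @ x_t + Ur @ h_prev)              # reset gate: how much past to use
        h_cand = np.tanh(Wh @ x_t + Uh @ (r * h_prev))   # candidate new state
        return (1 - z) * h_prev + z * h_cand             # gated blend of old and new

    h = gru_step(np.random.randn(input_size), np.zeros(hidden_size))
    print(h.shape)  # (4,)
    ```

    The final line is the key idea: because part of h_prev can pass through unchanged (when z is near 0), gradients have a more direct path backward through time instead of being squashed at every step.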


    Applications of RNN

    1. Text Processing

    • Sentiment Analysis

    • Machine Translation

    • Text Generation

    Example:
    Predict next word:
    "I love deep ___" → learning

    2. Speech Recognition

    Convert speech → text

    Used in:

    • Google Voice typing

    • Virtual assistants

    3. Time Series Forecasting

    • Stock prediction

    • Weather forecasting

    • Sales prediction

    4. Chatbots

    Conversation modeling