
Convolutional Neural Networks (CNN)

  • This lesson explains Convolutional Neural Networks and how they are used in image recognition and computer vision applications.
  • Why CNN?

    Problem with Normal ANN for Images

    If image size = 224 × 224 × 3

    Total input neurons = 150,528

    Connecting every input to every neuron in a fully connected layer leads to:
    Too many parameters (weights)
    A higher risk of overfitting
    Very slow training


    Why CNN is Better?

    Uses local connections
    Shares weights (same filter across image)
    Detects spatial patterns
    Fewer parameters
    Well suited to image tasks
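To make the parameter gap concrete, here is a rough count for the 224 × 224 × 3 image above. The layer sizes (a hidden layer of 1,000 neurons and a convolution layer of 32 filters of size 3×3) are illustrative assumptions, not values from this lesson.

```python
# Rough parameter count for a 224 x 224 x 3 image.
# Layer sizes below are illustrative assumptions.
height, width, channels = 224, 224, 3
inputs = height * width * channels  # 150,528 input values

# Fully connected: every input connects to every neuron, plus one bias per neuron.
hidden_neurons = 1000
dense_params = inputs * hidden_neurons + hidden_neurons

# Convolution: each filter has kernel * kernel * channels weights plus one bias,
# and the same filter is reused at every image position.
filters, kernel = 32, 3
conv_params = filters * (kernel * kernel * channels + 1)

print(f"Inputs:       {inputs:,}")        # 150,528
print(f"Dense params: {dense_params:,}")  # 150,529,000
print(f"Conv params:  {conv_params:,}")   # 896
```

The convolution layer stays small because its parameter count depends on the filter size and channel count, not on the image size.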

    CNN automatically learns:

    • Edges

    • Shapes

    • Textures

    • Objects


    Convolution Operation

    Convolution = Applying a small filter (kernel) over image.

    Example:

    Input Image (5×5)
    Filter (3×3)

    Filter slides over image and performs:

    $$\text{Output} = \sum (\text{Image} \times \text{Filter})$$

    This creates a feature map.
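The sliding sum of products can be sketched in plain Python. This is a minimal illustration that assumes stride 1 and no padding; the image values and the vertical-edge filter are made up for the example. Like deep learning libraries, it does not flip the filter before sliding it.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and sum the products."""
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    feature_map = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            total = 0
            for a in range(k):
                for b in range(k):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        feature_map.append(row)
    return feature_map

# Illustrative 5x5 image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]

# A hand-made vertical-edge filter (an assumption for this example).
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)  # each row prints [3, 3, 0]
```

The 3×3 feature map is large where the window covers the dark-to-bright boundary and zero where the window sees a uniform region.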

    Intuition:

    Filter detects specific patterns:

    • Vertical edge

    • Horizontal edge

    • Corners


    Filters & Feature Maps

    Filter (Kernel)

    Small matrix like:

    • 3×3

    • 5×5

    Each filter detects one feature.

    Feature Map

    Output generated after applying filter.

    If we use:

    • 10 filters → 10 feature maps

    More filters = more learned features.
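The one-filter-per-feature-map relationship can be sketched as follows. The three filters are hand-written for illustration and their names are hypothetical; a trained CNN learns its own filter values.

```python
def convolve2d(image, kernel):
    """Stride-1, no-padding sliding-window sum of products."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(k) for b in range(k))
            for j in range(cols)
        ]
        for i in range(rows)
    ]

# Illustrative 5x5 image with a bright square in the middle.
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

# Hand-written 3x3 filters (hypothetical names and values).
filters = {
    "vertical_edge":   [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],
    "horizontal_edge": [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],
    "box_sum":         [[1, 1, 1], [1, 1, 1], [1, 1, 1]],
}

# Applying each filter to the same image yields one feature map per filter.
feature_maps = {name: convolve2d(image, kern) for name, kern in filters.items()}

print(len(filters), "filters ->", len(feature_maps), "feature maps")
for name, fmap in feature_maps.items():
    print(name, fmap)
```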


    Padding & Stride

    Stride

    Stride = number of pixels the filter moves at each step.

    • Stride = 1 → moves 1 pixel at a time

    • Stride = 2 → moves 2 pixels at a time (skips every other position)

    Higher stride → smaller output

    Padding

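    Padding = adding zeros around the border of the image.

    Types:

    • Valid Padding → No padding (output shrinks)

    • Same Padding → Add enough zeros to keep the output the same size as the input (at stride 1)

    Example (3×3 filter, stride 1):

    Without padding:
    5×5 → 3×3

    With padding (1 pixel of zeros on each side):
    5×5 → 5×5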
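Both examples follow from the standard output-size formula, floor((n + 2p − k) / s) + 1, where n is the input size, k the filter size, p the padding on each side, and s the stride. A small sketch, with a hypothetical helper name:

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Output width/height for an n x n input: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1

print(conv_output_size(5, 3))             # no padding: 3
print(conv_output_size(5, 3, padding=1))  # one ring of zeros: 5
print(conv_output_size(5, 3, stride=2))   # stride 2, no padding: 2
```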


    Pooling Layer (Downsampling)

    Pooling reduces the size of each feature map, and with it the computation in later layers.

    Max Pooling

    Takes maximum value in window.

    Example (2×2):

    1 3

    4 2

    Output = 4

    Most commonly used.

    Average Pooling

    Takes average value.

    1 3

    4 2

    Output = 2.5
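Both pooling types can be checked with a short sketch. It assumes non-overlapping windows (stride equal to the window size); the 4×4 feature map is made up for illustration.

```python
def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling: window of size x size, stride equal to size."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[i + a][j + b] for a in range(size) for b in range(size)]
            row.append(max(window) if mode == "max" else sum(window) / len(window))
        pooled.append(row)
    return pooled

example = [[1, 3],
           [4, 2]]
print(pool2d(example, mode="max"))      # [[4]]
print(pool2d(example, mode="average"))  # [[2.5]]

# Illustrative 4x4 feature map: 2x2 pooling halves each dimension.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
print(pool2d(feature_map, mode="max"))  # [[4, 2], [2, 8]]
```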


    Why Pooling?

    Reduces computation
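    Helps reduce overfitting (fewer values feed the later layers)
    Makes the model more tolerant of small shifts in the image (approximate translation invariance)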


    CNN Architecture

    Typical CNN Structure:

    Input Image

        ↓

    Convolution Layer

        ↓

    ReLU Activation

        ↓

    Pooling Layer

        ↓

    Convolution

        ↓

    Pooling

        ↓

    Flatten

        ↓

    Fully Connected Layer

        ↓

    Output Layer (Softmax)


    Image Classification

    CNN is widely used for:

    • Cat vs Dog classification

    • Face recognition

    • Medical image detection

    • Object detection

    Real Example:

    If dataset = Cats & Dogs

    Final layer:

    • 1 neuron (Sigmoid) → Binary classification
      OR

    • 2 neurons (Softmax)

    Loss Function:

    • Binary Cross Entropy (with the 1-neuron Sigmoid output)

    • Categorical Cross Entropy (with the 2-neuron Softmax output)
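As a rough illustration of the single-neuron option, the sketch below pairs a sigmoid output with binary cross-entropy for one image. The raw score and the label convention (1 = dog, 0 = cat) are assumptions made for the example.

```python
import math

def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, p):
    """Loss for one sample: y_true is 0 or 1, p is the predicted P(class 1)."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# Hypothetical raw score from the final neuron; treat 1 = dog, 0 = cat.
score = 2.0
p_dog = sigmoid(score)

print(f"P(dog) = {p_dog:.3f}")
print(f"Loss if the image is a dog: {binary_cross_entropy(1, p_dog):.3f}")
print(f"Loss if the image is a cat: {binary_cross_entropy(0, p_dog):.3f}")
```

The loss is small when the predicted probability agrees with the true label and grows quickly when a confident prediction is wrong.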

    Example (MNIST digit classification):

    Import Libraries

Importing TensorFlow and Keras Libraries for Deep Learning in Python

This Python code imports essential libraries used for building deep learning models. It includes TensorFlow and Keras for creating neural networks and Matplotlib for data visualization. These libraries provide tools for designing, training, and visualizing machine learning and deep learning models.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import matplotlib.pyplot as plt
  • Load Dataset

Loading MNIST Dataset in Python using TensorFlow Keras

This Python example demonstrates how to load the MNIST handwritten digit dataset using TensorFlow Keras. The code imports the dataset, separates it into training and testing data, and prints the shape of both datasets to understand the number of images and their dimensions.

# Load MNIST dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

print("Training data shape:", x_train.shape)
print("Testing data shape:", x_test.shape)
  • Output:

    Training data shape: (60000, 28, 28)

    Testing data shape: (10000, 28, 28)



    Preprocessing

Data Preprocessing for CNN in Python (Normalization and Reshaping)

This Python example demonstrates how to preprocess image data before training a Convolutional Neural Network (CNN). The code normalizes pixel values from the range 0–255 to 0–1 and reshapes the dataset to a 4D format (samples, height, width, channels) required by CNN models. This step prepares the MNIST dataset for deep learning training.

CNN expects 4D input:
 (samples, height, width, channels)
# Normalize values (0–255 → 0–1)
x_train = x_train / 255.0
x_test = x_test / 255.0

# Reshape to add channel dimension
x_train = x_train.reshape(-1, 28, 28, 1)
x_test = x_test.reshape(-1, 28, 28, 1)

print("New shape:", x_train.shape)
  • Output:

    New shape: (60000, 28, 28, 1)


    Build CNN Model

Building a CNN Model in Python using TensorFlow Keras

This Python example demonstrates how to build a Convolutional Neural Network (CNN) using TensorFlow Keras for image classification. The model includes convolution layers for feature extraction, max pooling layers for dimensionality reduction, a flatten layer to convert data into a vector, and dense layers for classification. The final output layer uses the softmax activation function to classify images into 10 different classes.

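model = keras.Sequential([

    # Input: 28x28 grayscale images (1 channel)
    keras.Input(shape=(28, 28, 1)),

    # 1st Convolution Layer: 32 filters of size 3x3, then 2x2 max pooling
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),

    # 2nd Convolution Layer: 64 filters of size 3x3, then 2x2 max pooling
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),

    # Flatten the feature maps into a 1D vector
    layers.Flatten(),

    # Fully Connected Layer
    layers.Dense(64, activation='relu'),

    # Output Layer (10 classes, one probability per digit)
    layers.Dense(10, activation='softmax')
])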
  • Compile Model

Compiling a CNN Model in Python using TensorFlow Keras

This Python example demonstrates how to compile a Convolutional Neural Network (CNN) model using TensorFlow Keras. The model is configured with the Adam optimizer, sparse_categorical_crossentropy as the loss function for multi-class classification with integer labels (0–9, not one-hot encoded), and accuracy as the evaluation metric to measure the model’s performance during training.

model.compile(
   optimizer='adam',
   loss='sparse_categorical_crossentropy',
   metrics=['accuracy']
)
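To show what the sparse loss measures, here is a hand-computed sketch for a single image. The probability vector is made up; the function name only mirrors the Keras loss string and is not the Keras implementation.

```python
import math

def sparse_categorical_crossentropy(label, probs):
    """Loss for one sample: `label` is an integer class index, `probs` sums to 1."""
    return -math.log(probs[label])

# Hypothetical softmax output over digits 0-9 for one image.
probs = [0.01, 0.01, 0.02, 0.01, 0.01, 0.01, 0.01, 0.90, 0.01, 0.01]

print(round(sparse_categorical_crossentropy(7, probs), 4))  # true digit is 7: small loss
print(round(sparse_categorical_crossentropy(2, probs), 4))  # true digit is 2: large loss

# The non-sparse "categorical" variant expects a one-hot label instead.
one_hot = [1 if i == 7 else 0 for i in range(10)]
categorical = -sum(t * math.log(p) for t, p in zip(one_hot, probs))
print(round(categorical, 4))  # same value as the sparse version for label 7
```

Both variants compute the same quantity; the sparse form simply accepts the integer labels that MNIST provides.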
  • Train Model

Training a CNN Model in Python using TensorFlow Keras

This Python example demonstrates how to train a Convolutional Neural Network (CNN) using the model.fit() function in TensorFlow Keras. The model is trained on the training dataset for multiple epochs, while the testing dataset is used as validation data to monitor the model’s performance during training. The training history is stored to track metrics such as accuracy and loss.

history = model.fit(
   x_train, y_train,
   epochs=5,
   validation_data=(x_test, y_test)
)
  • Sample Training Output

    Epoch 1/5

    1875/1875 [==============================] - 5s - loss: 0.15 - accuracy: 0.95 - val_loss: 0.05 - val_accuracy: 0.98


    Epoch 2/5

    1875/1875 - loss: 0.04 - accuracy: 0.98 - val_loss: 0.04 - val_accuracy: 0.98


    Epoch 3/5

    1875/1875 - loss: 0.02 - accuracy: 0.99 - val_loss: 0.03 - val_accuracy: 0.99



    Final Accuracy ≈ 98–99%
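The 1875 in the log above is the number of batches per epoch. A quick check, assuming model.fit() is called without a batch_size argument so that the Keras default of 32 applies:

```python
import math

train_samples = 60000  # MNIST training images
batch_size = 32        # Keras default when model.fit() gets no batch_size

steps_per_epoch = math.ceil(train_samples / batch_size)
print(steps_per_epoch)  # 1875

# A larger batch size (illustrative) means fewer steps per epoch.
print(math.ceil(train_samples / 128))  # 469
```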


    Evaluate Model

Evaluating a CNN Model in Python using TensorFlow Keras

This Python example demonstrates how to evaluate a trained Convolutional Neural Network (CNN) using TensorFlow Keras. The code tests the model on unseen data (x_test and y_test), calculates the loss and accuracy, and prints the test accuracy to measure the model’s performance on new images.

test_loss, test_acc = model.evaluate(x_test, y_test)
print("Test Accuracy:", test_acc)
  • Output:

    Test Accuracy: 0.9892


    CNN Architecture Summary (Used Above)

    Input (28x28x1)

    Conv2D (32 filters)

    MaxPooling

    Conv2D (64 filters)

    MaxPooling

    Flatten

    Dense (64)

    Output (10 neurons, Softmax)
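The sizes flowing through this stack can be traced by hand. The sketch below assumes the Keras defaults used by the model above (stride 1 and no padding for Conv2D, non-overlapping 2×2 windows for MaxPooling2D); the helper names are hypothetical.

```python
def conv_out(n, kernel=3):
    """Size after a stride-1 convolution with no padding."""
    return n - kernel + 1

def pool_out(n, size=2):
    """Size after non-overlapping pooling; an odd leftover row/column is dropped."""
    return n // size

size = 28
size = conv_out(size)    # 26 after Conv2D (32 filters)
size = pool_out(size)    # 13 after MaxPooling
size = conv_out(size)    # 11 after Conv2D (64 filters)
size = pool_out(size)    # 5 after MaxPooling
flat = size * size * 64  # 1600 values after Flatten

# Weights plus one bias per filter or neuron.
params = {
    "conv1": 32 * (3 * 3 * 1 + 1),
    "conv2": 64 * (3 * 3 * 32 + 1),
    "dense": 64 * (flat + 1),
    "output": 10 * (64 + 1),
}
for name, count in params.items():
    print(f"{name:>6}: {count:,}")
print(f" total: {sum(params.values()):,}")  # 121,930
```

Most of the parameters sit in the first Dense layer, not in the convolution layers.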

    How the CNN Worked Here

    1. First Conv Layer → Typically detects simple patterns such as edges

    2. Second Conv Layer → Typically combines them into shapes

    3. Pooling → Reduce size

    4. Flatten → Convert to 1D

    5. Dense → Classification

    6. Softmax → Probability of 0–9
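The last step can be illustrated with a small softmax sketch. The ten raw scores are hypothetical stand-ins for the output of the final Dense layer before activation.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores, one per digit 0-9.
scores = [0.5, 1.0, 0.2, 4.0, 0.1, 0.3, 0.0, 1.5, 0.7, 0.4]
probs = softmax(scores)

predicted_digit = probs.index(max(probs))
print("Predicted digit:", predicted_digit)  # 3, the index of the largest score
print("Probabilities sum to", round(sum(probs), 6))
```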
