Convolutional Neural Networks (CNN)

This lesson explains Convolutional Neural Networks and how they are used in image recognition and computer vision applications.

Why CNN?
Problem with Normal ANN for Images
If image size = 224 × 224 × 3
Total input neurons = 224 × 224 × 3 = 150,528
A fully connected layer on this input means:
- Too many parameters (every neuron in the layer needs 150,528 weights)
- Overfitting
- Very slow training
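To make the parameter problem concrete, here is a rough count for one fully connected layer on this input. The hidden layer size of 1,000 neurons is an assumption chosen only for illustration.

```python
# Parameter count for one fully connected (dense) layer on a 224x224x3 image.
# The hidden layer size (1,000 neurons) is an illustrative assumption.
height, width, channels = 224, 224, 3
input_neurons = height * width * channels     # 150,528 input values

hidden_neurons = 1000                          # assumed layer size
weights = input_neurons * hidden_neurons       # one weight per input, per neuron
biases = hidden_neurons                        # one bias per neuron
dense_params = weights + biases

print("Input neurons:", input_neurons)
print("Dense layer parameters:", dense_params)
```

Under these assumptions a single layer already needs about 150 million parameters.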

Why CNN is Better?
- Uses local connections (each neuron looks at only a small patch of the image)
- Shares weights (the same filter is applied across the whole image)
- Detects spatial patterns
- Fewer parameters
- Well suited to image tasks
CNN automatically learns:
- Edges
- Shapes
- Textures
- Objects
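Weight sharing is what keeps the parameter count small: a filter's weights are reused at every position, so the count depends on the filter size and the number of filters, not on the image size. A small sketch, assuming a layer of 32 filters of size 3×3 on an RGB input (these numbers and the helper name `conv_params` are illustrative):

```python
def conv_params(filter_h, filter_w, in_channels, num_filters):
    """Parameters in one convolution layer: weights per filter plus one bias each."""
    weights_per_filter = filter_h * filter_w * in_channels
    return num_filters * (weights_per_filter + 1)

# Assumed layer: 32 filters of size 3x3 applied to a 3-channel (RGB) image.
params = conv_params(3, 3, 3, 32)

# The image height and width never appear in the formula, so the count is
# the same for a 28x28 image and a 224x224 image.
print("Conv layer parameters:", params)
```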
Convolution Operation
Convolution = applying a small filter (kernel) over the image.
Example:
Input Image (5×5)
Filter (3×3)
The filter slides over the image and, at each position, multiplies the overlapping values element-wise and sums them:
$$\text{Output} = \sum (\text{Image} \times \text{Filter})$$
Repeating this at every position creates a feature map (3×3 for this example, with stride 1 and no padding).
Intuition:
Filter detects specific patterns:
- Vertical edge
- Horizontal edge
- Corners
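The sliding sum described above can be sketched in plain Python. The image values and the vertical-edge filter below are made up for illustration, and the helper `convolve2d` assumes a square image and filter, stride 1 and no padding. Like deep learning libraries, it does not flip the filter before applying it.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and sum the products."""
    k = len(kernel)
    out_size = len(image) - k + 1
    output = []
    for row in range(out_size):
        out_row = []
        for col in range(out_size):
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[row + i][col + j] * kernel[i][j]
            out_row.append(total)
        output.append(out_row)
    return output

# Assumed 5x5 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

# Assumed 3x3 vertical-edge filter: responds to dark-to-bright changes, left to right.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, vertical_edge)
for row in feature_map:
    print(row)   # each row prints [0, 3, 3]
```

The feature map is 0 where the window covers a flat region and 3 where it overlaps the edge, which is what "the filter detects a vertical edge" means in practice.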
Filters & Feature Maps
Filter (Kernel)
A small matrix of learnable weights, for example:
- 3×3
- 5×5
Each filter learns to detect one kind of feature.
Feature Map
The output generated by applying one filter to the input.
If we use:
- 10 filters → 10 feature maps
More filters = more learned features.

Padding & Stride
Stride
Stride = number of pixels the filter moves at each step.
- Stride = 1 → moves 1 pixel at a time
- Stride = 2 → moves 2 pixels at a time (skips every other position)
Higher stride → smaller output
Padding
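Padding = adding zeros around the border of the image.
Types:
- Valid Padding → No padding (output shrinks)
- Same Padding → Add zeros so the output keeps the input size (at stride 1)
Example (3×3 filter, stride 1):
Without padding:
5×5 → 3×3
With padding (1 pixel of zeros on each side):
5×5 → 5×5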
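The effect of stride and padding on output size follows the standard formula floor((n + 2p − k) / s) + 1, where n is the input size, k the filter size, p the number of zero pixels added on each side and s the stride. A minimal sketch (the function name `conv_output_size` is only illustrative):

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Output width/height of a convolution: floor((n + 2p - k) / s) + 1."""
    return (input_size + 2 * padding - filter_size) // stride + 1

valid = conv_output_size(5, 3)               # no padding:        5x5 -> 3x3
same = conv_output_size(5, 3, padding=1)     # 1 pixel of zeros:  5x5 -> 5x5
strided = conv_output_size(5, 3, stride=2)   # stride 2:          5x5 -> 2x2

print(valid, same, strided)   # 3 5 2
```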

Pooling Layer (Downsampling)
Pooling reduces the size of the feature maps and the computation in later layers.
Max Pooling
Takes the maximum value in each window.
Example (2×2):
1 3
4 2
Output = 4
Max pooling is the most commonly used type.
Average Pooling
Takes the average value in each window.
Example (2×2):
1 3
4 2
Output = 2.5
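Both pooling types can be sketched with one small function. This is an illustrative helper (`pool2d` is not a library function) that assumes non-overlapping windows, the usual setup for 2×2 pooling. The 4×4 feature map is made up.

```python
def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling: one output value per size x size window."""
    pooled = []
    for row in range(0, len(feature_map) - size + 1, size):
        pooled_row = []
        for col in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[row + i][col + j]
                      for i in range(size) for j in range(size)]
            if mode == "max":
                pooled_row.append(max(window))
            else:
                pooled_row.append(sum(window) / len(window))
        pooled.append(pooled_row)
    return pooled

# The 2x2 window from the examples above.
window = [[1, 3],
          [4, 2]]
print(pool2d(window, mode="max"))       # [[4]]
print(pool2d(window, mode="average"))   # [[2.5]]

# A made-up 4x4 feature map shrinks to 2x2.
feature_map = [[1, 3, 2, 1],
               [4, 2, 0, 5],
               [7, 1, 3, 3],
               [0, 2, 6, 4]]
print(pool2d(feature_map, mode="max"))  # [[4, 5], [7, 6]]
```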

Why Pooling?
- Reduces computation
- Helps reduce overfitting
- Makes the model more robust to small shifts in the image (approximate translation invariance)

CNN Architecture
Typical CNN Structure:
Input Image
    ↓
Convolution Layer
    ↓
ReLU Activation
    ↓
Pooling Layer
    ↓
Convolution
    ↓
Pooling
    ↓
Flatten
    ↓
Fully Connected Layer
    ↓
Output Layer (Softmax)

Image Classification
CNNs are widely used for:
- Cat vs Dog classification
- Face recognition
- Medical image detection
- Object detection
Real Example:
If dataset = Cats & Dogs
Final layer and loss function:
- 1 neuron (Sigmoid) with Binary Cross Entropy loss
  OR
- 2 neurons (Softmax) with Categorical Cross Entropy loss
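For the single-neuron option, the sigmoid output and the binary cross entropy loss can be sketched as follows. The label convention (0 = cat, 1 = dog) and the raw score 2.0 are assumptions made only for this example.

```python
import math

def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, p):
    """Loss for one sample: -(y*log(p) + (1-y)*log(1-p))."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# Assumed labels: 0 = cat, 1 = dog. The raw score 2.0 is made up.
p_dog = sigmoid(2.0)
loss_if_dog = binary_cross_entropy(1, p_dog)   # prediction agrees with the label
loss_if_cat = binary_cross_entropy(0, p_dog)   # prediction disagrees with the label

print("P(dog):", round(p_dog, 3))
print("Loss if the image is a dog:", round(loss_if_dog, 3))
print("Loss if the image is a cat:", round(loss_if_cat, 3))
```

The loss is small when the predicted probability matches the true label and large when it does not.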
Example: CNN for handwritten digit classification (MNIST)
Import Libraries

Importing TensorFlow and Keras Libraries for Deep Learning in Python

This Python code imports essential libraries used for building deep learning models. It includes TensorFlow and Keras for creating neural networks and Matplotlib for data visualization. These libraries provide tools for designing, training, and visualizing machine learning and deep learning models.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import matplotlib.pyplot as plt

Load Dataset

Loading MNIST Dataset in Python using TensorFlow Keras

This Python example demonstrates how to load the MNIST handwritten digit dataset using TensorFlow Keras. The code imports the dataset, separates it into training and testing data, and prints the shape of both datasets to understand the number of images and their dimensions.

# Load MNIST dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

print("Training data shape:", x_train.shape)
print("Testing data shape:", x_test.shape)

Output:
Training data shape: (60000, 28, 28)
Testing data shape: (10000, 28, 28)

Preprocessing

Data Preprocessing for CNN in Python (Normalization and Reshaping)

This Python example demonstrates how to preprocess image data before training a Convolutional Neural Network (CNN). The code normalizes pixel values from the range 0–255 to 0–1 and reshapes the dataset to a 4D format (samples, height, width, channels) required by CNN models. This step prepares the MNIST dataset for deep learning training.

# CNN expects 4D input: (samples, height, width, channels)

# Normalize values (0–255 → 0–1)
x_train = x_train / 255.0
x_test = x_test / 255.0

# Reshape to add channel dimension
x_train = x_train.reshape(-1, 28, 28, 1)
x_test = x_test.reshape(-1, 28, 28, 1)

print("New shape:", x_train.shape)

Output:
New shape: (60000, 28, 28, 1)

Build CNN Model

Building a CNN Model in Python using TensorFlow Keras

This Python example demonstrates how to build a Convolutional Neural Network (CNN) using TensorFlow Keras for image classification. The model includes convolution layers for feature extraction, max pooling layers for dimensionality reduction, a flatten layer to convert data into a vector, and dense layers for classification. The final output layer uses the softmax activation function to classify images into 10 different classes.

model = keras.Sequential([

    # Input: 28×28 grayscale images (1 channel)
    keras.Input(shape=(28, 28, 1)),

    # 1st Convolution Layer: 32 filters of size 3×3
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),

    # 2nd Convolution Layer: 64 filters of size 3×3
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),

    # Flatten feature maps into a 1D vector
    layers.Flatten(),

    # Fully Connected Layer
    layers.Dense(64, activation='relu'),

    # Output Layer (10 classes)
    layers.Dense(10, activation='softmax')
])

Compile Model

Compiling a CNN Model in Python using TensorFlow Keras

This Python example demonstrates how to compile a Convolutional Neural Network (CNN) model using TensorFlow Keras. The model is configured with the Adam optimizer for efficient learning, sparse_categorical_crossentropy as the loss function for multi-class classification, and accuracy as the evaluation metric to measure the model’s performance during training.

model.compile(
   optimizer='adam',
   loss='sparse_categorical_crossentropy',
   metrics=['accuracy']
)
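To show what this loss measures, here is a plain-Python sketch of sparse categorical cross entropy for a single image. "Sparse" means the labels are integers (0–9), as in y_train, rather than one-hot vectors. The probabilities below are made up, and the function is an illustration, not the Keras implementation.

```python
import math

def sparse_categorical_crossentropy(label, probabilities):
    """Loss for one sample: negative log of the probability given to the true class."""
    return -math.log(probabilities[label])

# Made-up softmax output for one image over the 10 digit classes (sums to 1).
probs = [0.01, 0.01, 0.02, 0.01, 0.01, 0.01, 0.01, 0.90, 0.01, 0.01]

loss_correct = sparse_categorical_crossentropy(7, probs)   # true digit is 7
loss_wrong = sparse_categorical_crossentropy(2, probs)     # true digit is 2

print("Loss when the true digit is 7:", round(loss_correct, 3))
print("Loss when the true digit is 2:", round(loss_wrong, 3))
```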

Train Model

Training a CNN Model in Python using TensorFlow Keras

This Python example demonstrates how to train a Convolutional Neural Network (CNN) using the model.fit() function in TensorFlow Keras. The model is trained on the training dataset for 5 epochs. For simplicity, the testing dataset doubles as validation data to monitor the model’s performance during training; holding out a separate validation split would keep the final test score unbiased. The returned history object stores metrics such as accuracy and loss for each epoch.

history = model.fit(
   x_train, y_train,
   epochs=5,
   validation_data=(x_test, y_test)
)

Sample Training Output (first 3 of 5 epochs shown; exact values vary from run to run)
Epoch 1/5
1875/1875 [==============================] - 5s - loss: 0.15 - accuracy: 0.95 - val_loss: 0.05 - val_accuracy: 0.98

Epoch 2/5
1875/1875 - loss: 0.04 - accuracy: 0.98 - val_loss: 0.04 - val_accuracy: 0.98

Epoch 3/5
1875/1875 - loss: 0.02 - accuracy: 0.99 - val_loss: 0.03 - val_accuracy: 0.99

Final Accuracy ≈ 98–99%

Evaluate Model

Evaluating a CNN Model in Python using TensorFlow Keras

This Python example demonstrates how to evaluate a trained Convolutional Neural Network (CNN) using TensorFlow Keras. The code tests the model on unseen data (x_test and y_test), calculates the loss and accuracy, and prints the test accuracy to measure the model’s performance on new images.

test_loss, test_acc = model.evaluate(x_test, y_test)
print("Test Accuracy:", test_acc)

Output:
Test Accuracy: 0.9892

CNN Architecture Summary (Used Above)
Input (28x28x1)
↓
Conv2D (32 filters)
↓
MaxPooling
↓
Conv2D (64 filters)
↓
MaxPooling
↓
Flatten
↓
Dense (64)
↓
Output (10 neurons, Softmax)
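The shapes and parameter counts in this architecture can be checked by hand. The sketch below assumes the Keras defaults used above: Conv2D with stride 1 and no padding, and MaxPooling2D with a stride equal to the window size. The helper names are illustrative.

```python
def conv_out(size, kernel=3):
    """3x3 convolution, stride 1, no padding."""
    return size - kernel + 1

def pool_out(size, window=2):
    """2x2 max pooling with stride 2; a leftover row or column is dropped."""
    return size // window

size = 28
size = conv_out(size)     # Conv2D (32 filters): 28 -> 26
size = pool_out(size)     # MaxPooling:          26 -> 13
size = conv_out(size)     # Conv2D (64 filters): 13 -> 11
size = pool_out(size)     # MaxPooling:          11 -> 5
flat = size * size * 64   # Flatten: 5 x 5 x 64 values

total_params = (
    32 * (3 * 3 * 1 + 1)       # first Conv2D: 3x3x1 weights + 1 bias per filter
    + 64 * (3 * 3 * 32 + 1)    # second Conv2D: 3x3x32 weights + 1 bias per filter
    + (flat * 64 + 64)         # Dense (64)
    + (64 * 10 + 10)           # Output (10 neurons)
)

print("Flattened length:", flat)
print("Total parameters:", total_params)
```

Most of the parameters sit in the Dense (64) layer, not in the convolution layers.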
How CNN Worked Here
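1. First Conv Layer → Detects low-level features such as edges
2. Second Conv Layer → Combines them into shapes
3. Pooling → Reduces feature map size
4. Flatten → Converts feature maps to a 1D vector
5. Dense → Combines features for classification
6. Softmax → Probability for each digit 0–9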
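The last step can be sketched in plain Python: softmax turns the 10 raw scores into probabilities, and the predicted digit is the class with the highest probability. The scores below are made up for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]   # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

# Made-up raw scores for digits 0-9; digit 3 has the highest score.
scores = [0.5, 1.0, 0.2, 4.0, 0.1, 0.3, 0.0, 1.5, 0.4, 0.2]
probs = softmax(scores)
predicted_digit = probs.index(max(probs))

print("Probabilities sum to:", round(sum(probs), 6))
print("Predicted digit:", predicted_digit)
```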
