Convolutional Neural Networks (CNN)
This lesson explains Convolutional Neural Networks (CNNs) and how they are used in image recognition and computer vision applications.
Why CNN?
Problem with Normal ANN for Images
If image size = 224 × 224 × 3
Total input neurons = 224 × 224 × 3 = 150,528
A fully connected layer on this input means:
Too many parameters
Overfitting
Very slow training
Why CNN is Better?
Uses local connections
Shares weights (same filter across image)
Detects spatial patterns
Fewer parameters
Best for image tasks
CNN automatically learns:
Edges
Shapes
Textures
Objects
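To make the parameter savings concrete, here is a rough count for a single layer applied to the 224 × 224 × 3 image. The layer sizes (1000 dense units, 32 filters of size 3×3) are illustrative assumptions, not values from this lesson.

```python
# Rough parameter counts for one layer on a 224 x 224 x 3 image.
# dense_units and num_filters are assumed values chosen for illustration.
height, width, channels = 224, 224, 3
inputs = height * width * channels                    # 150,528 input values

dense_units = 1000                                    # hypothetical layer width
dense_params = inputs * dense_units + dense_units     # weights + biases

num_filters, kernel = 32, 3                           # hypothetical conv layer
conv_params = kernel * kernel * channels * num_filters + num_filters

print("Inputs:", inputs)                              # 150528
print("Dense layer parameters:", dense_params)        # 150529000
print("Conv layer parameters:", conv_params)          # 896
```

The convolution layer needs far fewer parameters because each filter is reused at every position of the image.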
Convolution Operation
Convolution = Applying a small filter (kernel) over image.
Example:
Input Image (5×5)
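Filter (3×3)
Filter slides over image and performs:
$$\text{Output} = \sum (\text{Image} \times \text{Filter})$$
At each position, the overlapping image values and filter values are multiplied element-wise and summed into one output value.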
This creates a feature map.
Intuition:
Filter detects specific patterns:
Vertical edge
Horizontal edge
Corners
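The sliding-window sum described above can be sketched in plain Python. The 5×5 image and the vertical-edge filter below are made-up values for illustration; the function assumes a square image and filter, stride 1, and no padding. Like most deep learning libraries, it does not flip the filter.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding)."""
    k = len(kernel)
    out_size = len(image) - k + 1
    feature_map = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            # Multiply the overlapping values element-wise and sum them
            total = 0
            for m in range(k):
                for n in range(k):
                    total += image[i + m][j + n] * kernel[m][n]
            row.append(total)
        feature_map.append(row)
    return feature_map

# Illustrative image: dark on the left (0), bright on the right (1)
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Illustrative vertical-edge filter
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

feature_map = convolve2d(image, vertical_edge)
for row in feature_map:
    print(row)   # each row prints [3, 3, 0]
```

The large values mark where the window covers the dark-to-bright boundary; the 0 marks a flat region.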
Filters & Feature Maps
Filter (Kernel)
Small matrix like:
3×3
5×5
Each filter detects one feature.
Feature Map
Output generated after applying filter.
If we use:
10 filters → 10 feature maps
More filters = more learned features.
Padding & Stride
Stride
Stride = number of pixels the filter moves at each step.
Stride = 1 → moves 1 pixel at a time
Stride = 2 → moves 2 pixels at a time (skips every other position)
Higher stride → smaller output
Padding
Padding = adding zeros around image.
Types:
Valid Padding → No padding (output shrinks)
Same Padding → Add zeros to keep same size
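The effect of stride and padding on output size follows the standard formula output = floor((n + 2p − k) / s) + 1, where n is the input size, k the filter size, p the padding on each side and s the stride. A minimal sketch, with the function name chosen here for illustration:

```python
def conv_output_size(n, k, stride=1, padding=0):
    """Output width/height of a convolution along one dimension."""
    return (n + 2 * padding - k) // stride + 1

print(conv_output_size(5, 3))              # valid padding: 3
print(conv_output_size(5, 3, padding=1))   # same padding for a 3x3 filter: 5
print(conv_output_size(5, 3, stride=2))    # stride 2: 2
```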
Example (3×3 filter, stride 1):
Without padding:
5×5 → 3×3
With padding:
5×5 → 5×5
Pooling Layer (Downsampling)
Pooling reduces the size of the feature maps and, with it, the computation.
Max Pooling
Takes maximum value in window.
Example (2×2):
1 3
4 2
Output = 4
Most commonly used.
Average Pooling
Takes average value.
1 3
4 2
Output = 2.5
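Both pooling types can be reproduced with a short plain-Python sketch. The helper name `pool2d` and the 4×4 image are assumptions made for illustration; the windows do not overlap (stride equals window size).

```python
def pool2d(image, size=2, mode="max"):
    """Non-overlapping pooling: the window moves by its own size."""
    pooled = []
    for i in range(0, len(image) - size + 1, size):
        row = []
        for j in range(0, len(image[0]) - size + 1, size):
            window = [image[i + a][j + b]
                      for a in range(size) for b in range(size)]
            if mode == "max":
                row.append(max(window))
            else:
                row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled

# The 2x2 window from the examples above
window = [[1, 3],
          [4, 2]]
print(pool2d(window, mode="max"))   # [[4]]
print(pool2d(window, mode="avg"))   # [[2.5]]

# An illustrative 4x4 feature map shrinks to 2x2
image = [[1, 3, 2, 0],
         [4, 2, 1, 1],
         [0, 1, 5, 6],
         [2, 2, 7, 8]]
print(pool2d(image))                # [[4, 2], [2, 8]]
```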
Why Pooling?
Reduces computation
Reduces overfitting
Makes the model more tolerant of small shifts in the image (approximate translation invariance)
CNN Architecture
Typical CNN Structure:
Input Image
↓
Convolution Layer
↓
ReLU Activation
↓
Pooling Layer
↓
Convolution
↓
Pooling
↓
Flatten
↓
Fully Connected Layer
↓
Output Layer (Softmax)
Image Classification
CNN is widely used for:
Cat vs Dog classification
Face recognition
Medical image detection
Object detection
Real Example:
If dataset = Cats & Dogs
Final layer:
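1 neuron (Sigmoid) → Binary classification
OR
2 neurons (Softmax)
Loss Function:
Binary Cross Entropy (for the 1-neuron sigmoid output; the 2-neuron softmax output is usually paired with categorical cross entropy)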
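As a rough illustration of how binary cross entropy scores a single prediction, here is a plain-Python sketch. The label convention (1 = dog, 0 = cat) and the probabilities are assumptions made for the example.

```python
import math

def binary_cross_entropy(y_true, p):
    """Loss for one sample: y_true is 0 or 1, p is the predicted probability of class 1."""
    eps = 1e-12                        # avoid log(0)
    p = min(max(p, eps), 1 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# True label is 1 (dog, by assumption)
print(round(binary_cross_entropy(1, 0.9), 4))   # confident and right: 0.1054
print(round(binary_cross_entropy(1, 0.1), 4))   # confident and wrong: 2.3026
```

The loss grows quickly when the model is confident about the wrong class.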
Example:
Import Libraries
Importing TensorFlow and Keras Libraries for Deep Learning in Python
This Python code imports essential libraries used for building deep learning models. It includes TensorFlow and Keras for creating neural networks and Matplotlib for data visualization. These libraries provide tools for designing, training, and visualizing machine learning and deep learning models.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import matplotlib.pyplot as plt
Load Dataset
Loading MNIST Dataset in Python using TensorFlow Keras
This Python example demonstrates how to load the MNIST handwritten digit dataset using TensorFlow Keras. The code imports the dataset, separates it into training and testing data, and prints the shape of both datasets to understand the number of images and their dimensions.
# Load MNIST dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
print("Training data shape:", x_train.shape)
print("Testing data shape:", x_test.shape)
Output:
Training data shape: (60000, 28, 28)
Testing data shape: (10000, 28, 28)
Preprocessing
Data Preprocessing for CNN in Python (Normalization and Reshaping)
This Python example demonstrates how to preprocess image data before training a Convolutional Neural Network (CNN). The code normalizes pixel values from the range 0–255 to 0–1 and reshapes the dataset to a 4D format (samples, height, width, channels) required by CNN models. This step prepares the MNIST dataset for deep learning training.
CNN expects 4D input:
(samples, height, width, channels)
# Normalize values (0–255 → 0–1)
x_train = x_train / 255.0
x_test = x_test / 255.0
# Reshape to add channel dimension
x_train = x_train.reshape(-1, 28, 28, 1)
x_test = x_test.reshape(-1, 28, 28, 1)
print("New shape:", x_train.shape)
Output:
New shape: (60000, 28, 28, 1)
Build CNN Model
Building a CNN Model in Python using TensorFlow Keras
This Python example demonstrates how to build a Convolutional Neural Network (CNN) using TensorFlow Keras for image classification. The model includes convolution layers for feature extraction, max pooling layers for dimensionality reduction, a flatten layer to convert data into a vector, and dense layers for classification. The final output layer uses the softmax activation function to classify images into 10 different classes.
model = keras.Sequential([
    # 1st Convolution Layer
    layers.Conv2D(32, (3,3), activation='relu', input_shape=(28,28,1)),
    layers.MaxPooling2D((2,2)),
    # 2nd Convolution Layer
    layers.Conv2D(64, (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    # Flatten
    layers.Flatten(),
    # Fully Connected Layer
    layers.Dense(64, activation='relu'),
    # Output Layer (10 classes)
    layers.Dense(10, activation='softmax')
])
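To see how the image shrinks through this model and where the parameters sit, the shapes and counts can be worked out by hand. The sketch below assumes the Keras defaults used above (no padding, stride 1 for convolutions, pooling stride equal to the window size); the helper names are made up for illustration. The total should match what `model.summary()` reports.

```python
def conv_out(n, k):
    return n - k + 1                  # valid padding, stride 1

def pool_out(n, size):
    return n // size                  # non-overlapping pooling

def conv_params(k, in_channels, filters):
    return k * k * in_channels * filters + filters    # weights + biases

def dense_params(inputs, units):
    return inputs * units + units                     # weights + biases

size = 28
size = conv_out(size, 3)              # Conv2D(32):  28 -> 26
p1 = conv_params(3, 1, 32)            # 320
size = pool_out(size, 2)              # MaxPooling:  26 -> 13
size = conv_out(size, 3)              # Conv2D(64):  13 -> 11
p2 = conv_params(3, 32, 64)           # 18,496
size = pool_out(size, 2)              # MaxPooling:  11 -> 5
flat = size * size * 64               # Flatten:     5 * 5 * 64 = 1600
p3 = dense_params(flat, 64)           # 102,464
p4 = dense_params(64, 10)             # 650
total = p1 + p2 + p3 + p4

print("Flattened length:", flat)      # 1600
print("Total parameters:", total)     # 121930
```

Most of the parameters sit in the first dense layer, not in the convolution layers.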
Compile Model
Compiling a CNN Model in Python using TensorFlow Keras
This Python example demonstrates how to compile a Convolutional Neural Network (CNN) model using TensorFlow Keras. The model is configured with the Adam optimizer for efficient learning, sparse_categorical_crossentropy as the loss function for multi-class classification, and accuracy as the evaluation metric to measure the model’s performance during training.
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
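The "sparse" in the loss name means the labels are plain integers (such as 7) rather than one-hot vectors. For a single sample, the loss is the negative log of the probability assigned to the true class. A minimal plain-Python sketch, using three classes and made-up probabilities to keep it short:

```python
import math

def sparse_categorical_crossentropy(label, probs):
    """Loss for one sample: label is an integer class index, probs sums to 1."""
    return -math.log(probs[label])

probs = [0.1, 0.7, 0.2]               # illustrative softmax output for 3 classes

print(round(sparse_categorical_crossentropy(1, probs), 4))   # true class 1: 0.3567
print(round(sparse_categorical_crossentropy(2, probs), 4))   # true class 2: 1.6094
```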
Train Model
Training a CNN Model in Python using TensorFlow Keras
This Python example demonstrates how to train a Convolutional Neural Network (CNN) using the model.fit() function in TensorFlow Keras. The model is trained on the training dataset for multiple epochs, while the testing dataset is used as validation data to monitor the model’s performance during training. The training history is stored to track metrics such as accuracy and loss.
history = model.fit(
    x_train, y_train,
    epochs=5,
    validation_data=(x_test, y_test)
)
Sample Training Output
Epoch 1/5
1875/1875 [==============================] - 5s - loss: 0.15 - accuracy: 0.95 - val_loss: 0.05 - val_accuracy: 0.98
Epoch 2/5
1875/1875 - loss: 0.04 - accuracy: 0.98 - val_loss: 0.04 - val_accuracy: 0.98
Epoch 3/5
1875/1875 - loss: 0.02 - accuracy: 0.99 - val_loss: 0.03 - val_accuracy: 0.99
Final Accuracy ≈ 98–99%
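The 1875 in the log is the number of batches per epoch. Since no batch size is passed to model.fit(), this sketch assumes the Keras default of 32:

```python
import math

num_samples = 60000        # MNIST training images
batch_size = 32            # Keras default when batch_size is not given

steps_per_epoch = math.ceil(num_samples / batch_size)
print(steps_per_epoch)     # 1875
```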
Evaluate Model
Evaluating a CNN Model in Python using TensorFlow Keras
This Python example demonstrates how to evaluate a trained Convolutional Neural Network (CNN) using TensorFlow Keras. The code tests the model on unseen data (x_test and y_test), calculates the loss and accuracy, and prints the test accuracy to measure the model’s performance on new images.
test_loss, test_acc = model.evaluate(x_test, y_test)
print("Test Accuracy:", test_acc)
Output:
Test Accuracy: 0.9892
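The accuracy reported here is the fraction of test images whose predicted digit matches the true label. A small sketch with made-up labels and predictions:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

y_true = [7, 2, 1, 0, 4]   # illustrative true digits
y_pred = [7, 2, 1, 0, 9]   # illustrative predictions (last one is wrong)

print(accuracy(y_true, y_pred))   # 0.8
```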
CNN Architecture Summary (Used Above)
Input (28×28×1)
↓
Conv2D (32 filters)
↓
MaxPooling
↓
Conv2D (64 filters)
↓
MaxPooling
↓
Flatten
↓
Dense (64)
↓
Output (10 neurons, Softmax)
How CNN Worked Here
First Conv Layer → Typically detects low-level features such as edges and strokes
Second Conv Layer → Typically combines them into larger shapes
Pooling → Reduce size
Flatten → Convert to 1D
Dense → Classification
Softmax → Probability of 0–9
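The last step, turning the 10 output scores into probabilities and a predicted digit, can be sketched in plain Python. The scores below are made-up values, not output from the trained model.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(logits)
    exps = [math.exp(z - highest) for z in logits]   # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative raw scores for digits 0-9
logits = [0.1, 0.2, 0.1, 3.0, 0.3, 0.1, 0.2, 0.1, 0.4, 0.1]

probs = softmax(logits)
predicted_digit = probs.index(max(probs))   # index of the highest probability

print(predicted_digit)                      # 3
print(round(sum(probs), 6))                 # 1.0
```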