Regularization Techniques
- This lesson introduces regularization techniques used to prevent overfitting and improve neural network generalization.
L1 & L2 Regularization
Regularization works by adding a penalty term to the loss function.
New Loss:
Loss = Original Loss + Regularization Term
L1 Regularization (Lasso)
Adds absolute value of weights:
Loss = L + λ ∑ |W|

where L is the original loss and λ controls the strength of the penalty.
Effect:
Forces some weights to become exactly 0
Performs feature selection
Creates sparse model
Useful when:
Many irrelevant features
Want simpler model
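The penalty term itself is easy to compute directly. A minimal sketch in NumPy, using made-up weights, a made-up base loss of 2.0, and an illustrative λ of 0.01:

```python
import numpy as np

# Illustrative values, not from a real model
W = np.array([0.5, -1.2, 0.0, 3.0])
lam = 0.01                              # regularization strength (hyperparameter)

# L1 penalty: lambda times the sum of absolute weights
l1_penalty = lam * np.sum(np.abs(W))    # 0.01 * (0.5 + 1.2 + 0.0 + 3.0) = 0.047
loss = 2.0 + l1_penalty                 # 2.0 stands in for the original loss L
```

Note that the zero weight contributes nothing to the penalty, so the model is not pushed to grow it again.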
L2 Regularization (Ridge)
Adds squared weights:
Loss = L + λ ∑ W²
Effect:
Reduces weight magnitude
Does not make weights exactly zero
Smooth model
Most commonly used
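The L2 penalty can be sketched the same way. The weights, base loss of 2.0, and λ below are made up for illustration:

```python
import numpy as np

# Illustrative values, not from a real model
W = np.array([0.5, -1.2, 0.0, 3.0])
lam = 0.01                          # regularization strength (hyperparameter)

# L2 penalty: lambda times the sum of squared weights
l2_penalty = lam * np.sum(W ** 2)   # 0.01 * (0.25 + 1.44 + 0.0 + 9.0) = 0.1069
loss = 2.0 + l2_penalty             # 2.0 stands in for the original loss L
```

Because the penalty is quadratic, large weights (like 3.0 here) are punished much more heavily than small ones, which is why L2 shrinks magnitudes rather than zeroing weights out.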
L1 vs L2
L1 drives some weights to exactly zero, giving sparse models and implicit feature selection; L2 only shrinks weights toward zero without eliminating them, giving smooth models, and is the more common default.
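The difference in behavior can be demonstrated on a single weight, ignoring the data loss entirely. This sketch uses soft-thresholding (the proximal update for |w|) for the L1 step and a plain gradient step for the L2 step; the learning rate and λ are arbitrary:

```python
w_l1 = 1.0
w_l2 = 1.0
lr, lam = 0.1, 0.5   # illustrative learning rate and regularization strength

for _ in range(100):
    # L1 (soft-thresholding): shrink the magnitude by lr*lam, clamping at zero
    w_l1 = (1 if w_l1 >= 0 else -1) * max(abs(w_l1) - lr * lam, 0.0)
    # L2: gradient step on lam * w^2 (gradient is 2*lam*w), multiplicative shrink
    w_l2 = w_l2 - lr * (2 * lam * w_l2)

print(w_l1)  # 0.0 -> exactly zero (sparsity)
print(w_l2)  # tiny, but never exactly zero
```

The L1 update removes a fixed amount per step and clamps at zero, so the weight lands on exactly 0; the L2 update multiplies by a constant factor (here 0.9), so the weight only decays toward zero.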
Dropout
Dropout randomly turns off some neurons during training.
Example:
Dropout rate = 0.5
→ 50% of neurons are randomly ignored per batch
Why does it work?
Prevents neurons from:
Becoming dependent on each other
Memorizing training data
Forces network to learn robust features.
During Testing:
All neurons are used (no dropout).
Code Example
Neural Network with Dropout Example in Python using TensorFlow Keras
This Python example demonstrates how to add Dropout regularization to a neural network using TensorFlow Keras. The model consists of a Dense hidden layer with ReLU activation, a Dropout layer to prevent overfitting by randomly deactivating 50% of neurons during training, and an output layer with softmax activation for multi-class classification.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),                    # randomly zeroes 50% of units during training
    layers.Dense(10, activation='softmax')  # probabilities over 10 classes
])
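Under the hood, Keras uses "inverted" dropout: surviving activations are scaled up by 1/(1 − rate) during training so the expected activation matches test time. A NumPy sketch of that idea (an illustrative re-implementation, not Keras's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded for reproducibility
rate = 0.5
x = np.ones((1, 8))              # stand-in activations from a hypothetical layer

# Training: zero out ~50% of units, scale survivors by 1/(1 - rate)
mask = rng.random(x.shape) >= rate
train_out = x * mask / (1 - rate)   # entries are either 0.0 or 2.0

# Testing: dropout is a no-op; all units pass through unchanged
test_out = x
```

Because survivors are scaled to 2.0, the expected value of each unit stays 1.0, so no rescaling is needed at test time.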
Early Stopping
Instead of training for fixed epochs, stop training when:
Validation loss starts increasing.
Why?
When:
Training loss ↓
Validation loss ↑
It means overfitting has started.
Code Example
Using Early Stopping in TensorFlow Keras to Prevent Overfitting
This Python example demonstrates how to implement Early Stopping during neural network training using TensorFlow Keras. The EarlyStopping callback monitors the validation loss and stops training if it does not improve for a specified number of epochs (patience=3). This helps prevent overfitting and saves training time by stopping the model once it stops learning on validation data.
from tensorflow.keras.callbacks import EarlyStopping
early_stop = EarlyStopping(
    monitor='val_loss',  # watch validation loss
    patience=3           # stop after 3 epochs with no improvement
)

model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          callbacks=[early_stop])
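The stopping rule itself is simple enough to sketch in plain Python. The validation losses below are made up for illustration:

```python
# Made-up per-epoch validation losses: improves, then degrades for 3 epochs
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.53, 0.56, 0.40]
patience = 3

best = float("inf")
wait = 0
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if loss < best:           # validation improved: record it, reset the counter
        best = loss
        wait = 0
    else:                     # no improvement this epoch
        wait += 1
        if wait >= patience:  # patience exhausted: stop training
            stopped_at = epoch
            break

print(stopped_at)  # 6 -> stops at epoch 6, never reaching epoch 7
```

Note the trade-off patience controls: with patience=3 this run stops before the (made-up) rebound to 0.40 at epoch 7, which is exactly the early exit the callback is designed to make.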
Batch Normalization
Batch Normalization normalizes layer inputs.
X_normalized = (X − μ) / σ
Benefits:
Faster training
More stable gradients
Allows higher learning rate
Reduces internal covariate shift
Code Example
Using Batch Normalization in TensorFlow Keras Neural Networks
This Python snippet demonstrates how to apply Batch Normalization in a neural network using TensorFlow Keras. A Dense layer is followed by BatchNormalization and a separate ReLU activation layer. Batch Normalization normalizes the outputs of the previous layer, which helps stabilize and accelerate training while improving model performance.
from tensorflow.keras import layers

layers.Dense(128),               # linear layer, no activation yet
layers.BatchNormalization(),     # normalize the Dense outputs
layers.Activation('relu')        # apply ReLU after normalization
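The normalization formula above can be checked in NumPy. This is a sketch with made-up values; real BatchNormalization also learns a scale γ and shift β per feature and adds a small ε to σ for numerical stability:

```python
import numpy as np

# Illustrative batch of pre-activations for one feature (batch size 4)
X = np.array([2.0, 4.0, 6.0, 8.0])

mu = X.mean()                  # batch mean
sigma = X.std()                # batch standard deviation
X_norm = (X - mu) / sigma      # normalized: mean ~0, std ~1
```

After this step the feature has zero mean and unit standard deviation within the batch, which is what keeps layer inputs in a stable range during training.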