Hyperparameter Tuning
- This lesson explains hyperparameter tuning methods used to optimize machine learning and deep learning models.
Learning Rate Scheduling
Learning Rate Scheduling = changing the learning rate during training.
Instead of using a fixed learning rate, we adjust it over time.
Why?
If learning rate:
Too high → unstable training
Too low → slow convergence
Scheduling helps:
Faster convergence
Better final accuracy

Types of Learning Rate Scheduling
Step Decay
Reduce learning rate after fixed epochs.
Example:
0.01 → 0.001 → 0.0001

Exponential Decay
η_t = η_0 · e^(−kt)
Learning rate decreases smoothly.
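As a sketch, the decay can be computed directly, with η_0 the initial learning rate, k the decay rate, and t the epoch (the values below are illustrative):

```python
import math

def exponential_decay(eta0, k, t):
    """Learning rate after t epochs of exponential decay."""
    return eta0 * math.exp(-k * t)

# Initial rate 0.01, decay rate k = 0.1: the rate shrinks smoothly each epoch
for t in [0, 10, 20]:
    print(t, exponential_decay(0.01, 0.1, t))
```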
Reduce on Plateau
Reduce learning rate when validation loss stops improving.
Cosine Annealing
Learning rate decreases following a cosine curve.
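A minimal sketch of the cosine annealing schedule; η_max, η_min, and the total step count T below are illustrative values:

```python
import math

def cosine_annealing(eta_max, eta_min, t, T):
    """Learning rate at step t of T, following a half cosine from eta_max down to eta_min."""
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t / T))

print(cosine_annealing(0.01, 0.0001, 0, 100))    # starts at eta_max
print(cosine_annealing(0.01, 0.0001, 50, 100))   # midpoint of the curve
print(cosine_annealing(0.01, 0.0001, 100, 100))  # ends at eta_min
```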
Code Example (TensorFlow)
Using ReduceLROnPlateau in TensorFlow Keras to Adjust Learning Rate
This Python example demonstrates how to implement ReduceLROnPlateau in TensorFlow Keras to dynamically adjust the learning rate during training. The callback monitors the validation loss and reduces the learning rate by a factor (here 0.5) if the loss does not improve for a specified number of epochs (patience=2). This technique helps improve convergence and can prevent the model from getting stuck in local minima.
from tensorflow.keras.callbacks import ReduceLROnPlateau

# Halve the learning rate if val_loss has not improved for 2 consecutive epochs
lr_scheduler = ReduceLROnPlateau(
    monitor='val_loss',
    factor=0.5,
    patience=2
)

# model, X_train, y_train, X_val, y_val are defined elsewhere
model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          callbacks=[lr_scheduler])
Grid Search
Grid Search tries all possible combinations of hyperparameters.
Example:
3 learning rates × 2 batch sizes → 6 configurations, and every one of them is tested.
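A minimal pure-Python sketch of the idea; the search space and the scoring function stand in for real model training and are purely illustrative:

```python
from itertools import product

# Hypothetical search space: 3 learning rates x 2 batch sizes = 6 combinations
grid = {
    'learning_rate': [0.1, 0.01, 0.001],
    'batch_size': [32, 64],
}

def evaluate(params):
    """Stand-in for training a model and returning its validation score."""
    return -abs(params['learning_rate'] - 0.01) - abs(params['batch_size'] - 64) / 1000

keys = list(grid)
best_params, best_score = None, float('-inf')
for values in product(*(grid[k] for k in keys)):  # every combination, exhaustively
    params = dict(zip(keys, values))
    score = evaluate(params)
    if score > best_score:
        best_params, best_score = params, score

print(best_params)  # {'learning_rate': 0.01, 'batch_size': 64}
```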
Advantages:
Finds best combination (exhaustive)
Disadvantages:
Very slow
Expensive for deep learning

Random Search
Instead of trying all combinations, it selects random combinations.
Example:
Randomly try 20 configurations out of 100 possible.

Advantages:
Faster
Often finds good solution
More efficient than Grid Search

Research shows Random Search can be more effective than Grid Search when only a few hyperparameters actually matter.
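The same idea sketched in pure Python: sample a fixed budget of random configurations instead of enumerating all of them. The search space, scoring function, and budget below are illustrative stand-ins for real training:

```python
import random

random.seed(0)  # reproducible sampling

space = {
    'learning_rate': [0.1, 0.01, 0.001, 0.0001],
    'batch_size': [16, 32, 64, 128],
    'dropout': [0.0, 0.2, 0.5],
}  # 4 * 4 * 3 = 48 possible combinations

def evaluate(params):
    """Stand-in for training a model and returning a validation score."""
    return -abs(params['learning_rate'] - 0.01) - params['dropout'] / 10

budget = 10  # try only 10 random configurations out of the 48
best_params, best_score = None, float('-inf')
for _ in range(budget):
    params = {k: random.choice(v) for k, v in space.items()}
    score = evaluate(params)
    if score > best_score:
        best_params, best_score = params, score

print(best_params)
```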
Model Selection
Model Selection = choosing the best model based on validation performance.
Steps:
Split dataset:
Train
Validation
Test
Train multiple models
Compare metrics:
Accuracy
Precision
Recall
F1-score
Choose model with best validation performance
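The steps above can be sketched with hypothetical validation scores standing in for trained models:

```python
# Hypothetical validation F1-scores for three candidate models
val_scores = {
    'logistic_regression': 0.81,
    'random_forest': 0.86,
    'neural_net': 0.84,
}

# Pick the model with the best validation performance
best_model = max(val_scores, key=val_scores.get)
print(best_model)  # random_forest

# The chosen model is then evaluated once on the held-out test set
```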
Cross-Validation
Instead of a single validation split, use k-fold cross-validation: split the data into k folds, train on k − 1 folds, validate on the remaining fold, and rotate so each fold is used for validation once.
This gives a more reliable evaluation but is expensive for deep learning.
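A minimal sketch of k-fold splitting in plain Python (k = 3, toy dataset of 6 samples):

```python
def k_fold_splits(data, k):
    """Yield (train, validation) splits; each sample appears in validation exactly once."""
    fold_size = len(data) // k
    for i in range(k):
        val = data[i * fold_size:(i + 1) * fold_size]
        train = data[:i * fold_size] + data[(i + 1) * fold_size:]
        yield train, val

data = list(range(6))
for train, val in k_fold_splits(data, 3):
    print(train, val)
# Each of the 3 iterations trains on 4 samples and validates on the other 2
```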