Hyperparameter Tuning

  • This lesson explains hyperparameter tuning methods used to optimize machine learning and deep learning models.
  • Learning Rate Scheduling

    Learning Rate Scheduling = changing the learning rate during training.

    Instead of using a fixed learning rate, we adjust it over time.

    Why?

    If learning rate:

    • Too high → unstable training

    • Too low → slow convergence

    Scheduling helps:

    • Faster convergence

    • Better final accuracy

    Types of Learning Rate Scheduling

    Step Decay

    Reduce learning rate after fixed epochs.

    Example:
    0.01 → 0.001 → 0.0001

    Exponential Decay

    η_t = η_0 · e^(−k·t)

    where η_0 is the initial learning rate, k is the decay rate, and t is the training epoch (or step).

    Learning rate decreases smoothly.

    Reduce on Plateau

    Reduce learning rate when validation loss stops improving.

    Cosine Annealing

    The learning rate decreases smoothly, following a cosine curve from its initial value down to a minimum.
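The three formula-based schedules above can be sketched as plain Python functions (a minimal illustration; the drop factor, decay rate k, and epoch counts are made-up example values):

```python
import math

def step_decay(initial_lr, epoch, drop=0.1, epochs_per_drop=10):
    """Multiply the learning rate by `drop` every `epochs_per_drop` epochs."""
    return initial_lr * (drop ** (epoch // epochs_per_drop))

def exponential_decay(initial_lr, epoch, k=0.1):
    """Smooth decay: eta_t = eta_0 * exp(-k * t)."""
    return initial_lr * math.exp(-k * epoch)

def cosine_annealing(initial_lr, epoch, total_epochs, min_lr=0.0):
    """Learning rate follows half a cosine wave from initial_lr down to min_lr."""
    cos = math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (initial_lr - min_lr) * (1 + cos)
```

With `drop=0.1` and `epochs_per_drop=10`, step decay reproduces the 0.01 → 0.001 → 0.0001 sequence from the example above at epochs 0, 10, and 20.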

    Code Example (TensorFlow)

Using ReduceLROnPlateau in TensorFlow Keras to Adjust Learning Rate

This Python example demonstrates how to implement ReduceLROnPlateau in TensorFlow Keras to dynamically adjust the learning rate during training. The callback monitors the validation loss and reduces the learning rate by a factor (here 0.5) if the loss does not improve for a specified number of epochs (patience=2). This technique helps improve convergence and can prevent the model from getting stuck in local minima.

from tensorflow.keras.callbacks import ReduceLROnPlateau

# Halve the learning rate whenever validation loss stops improving
lr_scheduler = ReduceLROnPlateau(
    monitor='val_loss',  # quantity to watch
    factor=0.5,          # multiply the learning rate by this factor
    patience=2           # epochs with no improvement before reducing
)

# `model`, `X_train`, `y_train`, `X_val`, `y_val` are assumed to be defined
model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          callbacks=[lr_scheduler])
  • Grid Search

    Grid Search tries all possible combinations of hyperparameters.

    Example:

    Learning Rate    Batch Size
    0.01             32
    0.01             64
    0.001            32
    0.001            64

    It tests all combinations.

    Advantages:

    • Exhaustive: guaranteed to find the best combination within the grid

    Disadvantages:

    • Very slow, since combinations multiply as hyperparameters are added

    • Expensive for deep learning
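The exhaustive enumeration can be sketched in a few lines of plain Python (the search space matches the example table above; the `evaluate` function is a made-up placeholder for training and scoring a model):

```python
from itertools import product

# Search space matching the example table above
grid = {
    'learning_rate': [0.01, 0.001],
    'batch_size': [32, 64],
}

def evaluate(config):
    # Placeholder: in practice, train a model with `config`
    # and return its validation accuracy
    return -abs(config['learning_rate'] - 0.001)  # pretend 0.001 is best

# Enumerate every combination (2 x 2 = 4 here)
keys = list(grid)
configs = [dict(zip(keys, values)) for values in product(*grid.values())]
best = max(configs, key=evaluate)
print(len(configs), best)
```

In practice, libraries such as scikit-learn's `GridSearchCV` wrap this loop together with cross-validation.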


    Random Search

    Instead of trying all combinations, it selects random combinations.

    Example:
    Randomly try 20 configurations out of 100 possible.

    Advantages:

    • Faster than Grid Search

    • Often finds a good solution

    • More efficient use of a limited trial budget

    Research shows Random Search can be more effective than Grid Search when only a few hyperparameters are actually important.
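The sampling step can be sketched as follows (the search space values are made up; 4 × 5 × 5 = 100 possible configurations, of which 20 are tried, matching the example above):

```python
import random

# Hypothetical search space: 4 * 5 * 5 = 100 possible configurations
space = {
    'learning_rate': [0.1, 0.01, 0.001, 0.0001],
    'batch_size': [16, 32, 64, 128, 256],
    'dropout': [0.0, 0.1, 0.2, 0.3, 0.5],
}

random.seed(0)  # only for reproducibility of this sketch

def sample_config(space):
    """Pick one value at random for each hyperparameter."""
    return {name: random.choice(values) for name, values in space.items()}

all_combinations = 1
for values in space.values():
    all_combinations *= len(values)

# Try only 20 random configurations instead of all 100
trials = [sample_config(space) for _ in range(20)]
print(f"trying {len(trials)} of {all_combinations} possible configurations")
```

scikit-learn's `RandomizedSearchCV` implements this idea, including sampling from continuous distributions rather than fixed lists.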


    Model Selection

    Model Selection = choosing the best model based on validation performance.

    Steps:

    1. Split the dataset into:

    • Train

    • Validation

    • Test

    2. Train multiple models.

    3. Compare metrics:

    • Accuracy

    • Precision

    • Recall

    • F1-score

    4. Choose the model with the best validation performance.
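The final selection step reduces to a simple comparison. A minimal sketch, assuming the validation accuracies below were collected after training three hypothetical models:

```python
# Hypothetical validation accuracies gathered after training each candidate
val_scores = {
    'logistic_regression': 0.87,
    'random_forest': 0.91,
    'small_cnn': 0.93,
}

# Pick the model whose validation accuracy is highest
best_model = max(val_scores, key=val_scores.get)
print(best_model)  # small_cnn
```

The held-out test set is then used only once, to report the chosen model's final performance.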

    Cross-Validation

    Instead of one validation split, use k-fold cross-validation: split the data into k folds, and let each fold serve once as the validation set while the model trains on the remaining k − 1 folds.

    This gives a more reliable evaluation but is expensive for deep learning, since the model is trained k times.
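The fold construction can be sketched in plain Python (in practice scikit-learn's `KFold` does this, with shuffling and stratification options):

```python
def k_fold_indices(n_samples, k):
    """Split sample indices 0..n_samples-1 into k roughly equal folds."""
    fold_size, remainder = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        # Spread any remainder across the first folds
        end = start + fold_size + (1 if i < remainder else 0)
        folds.append(list(range(start, end)))
        start = end
    return folds

folds = k_fold_indices(10, 5)
# Each fold serves once as the validation set; the rest are training data
for val_fold in folds:
    train_idx = [i for f in folds if f is not val_fold for i in f]
    # train on train_idx, evaluate on val_fold, then average the k scores
```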