Ridge and Lasso Regression
- This module explains Ridge and Lasso Regression, two techniques that reduce overfitting in machine learning models through L2 and L1 regularization, and covers the bias-variance tradeoff that motivates them.
Overfitting Problem
What is Overfitting?
Overfitting happens when a model:
- Performs very well on training data
- Performs poorly on testing/new data
The model memorizes the training data instead of learning general patterns.
Example
Suppose we are predicting house prices.
If the model:
- Uses too many features
- Fits noise in the data
- Creates a very complex curve
then it will fit the training data perfectly but fail on new houses.
Signs of Overfitting
- Training accuracy is high
- Testing accuracy is low
- Model is too complex
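These signs can be reproduced deliberately. A minimal sketch (synthetic data; the degree, seed, and sample sizes are chosen arbitrarily for illustration) fits a very flexible degree-15 polynomial to a few noisy points and compares train and test scores:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

# 20 noisy training points sampled from a simple underlying curve
X = np.sort(rng.uniform(0, 1, 20)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 20)

# 20 fresh points from the same process, used as unseen test data
X_test = rng.uniform(0, 1, 20).reshape(-1, 1)
y_test = np.sin(2 * np.pi * X_test).ravel() + rng.normal(0, 0.2, 20)

# A deliberately over-complex model: degree-15 polynomial regression
model = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
model.fit(X, y)

# Training score is near-perfect, testing score drops sharply
print("Training score:", model.score(X, y))
print("Testing score:", model.score(X_test, y_test))
```

The large gap between the two scores is exactly the overfitting signature described above.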
Regularization Concept
Regularization is a technique used to:
- Reduce model complexity
- Prevent overfitting
- Penalize large coefficients
Idea:
Add a penalty term to the cost function.
Original Cost Function (MSE):
MSE = \frac{1}{n} \sum (y - \hat{y})^2
Regularized Cost Function:
Loss = MSE + Penalty
This penalty shrinks coefficient values.
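The regularized loss is easy to compute by hand. A small NumPy sketch (the targets, predictions, and weights below are made up purely for illustration) evaluates the MSE plus the two penalty variants introduced in the next sections:

```python
import numpy as np

# Made-up targets, predictions, and coefficients for illustration
y     = np.array([3.0, 5.0, 7.0])
y_hat = np.array([2.5, 5.5, 6.0])
w     = np.array([4.0, 0.5, -2.0])
lam   = 0.1  # regularization strength (lambda)

# Original cost: mean squared error
mse = np.mean((y - y_hat) ** 2)

# L1 penalty (Lasso-style): lambda * sum of absolute coefficients
l1_loss = mse + lam * np.sum(np.abs(w))

# L2 penalty (Ridge-style): lambda * sum of squared coefficients
l2_loss = mse + lam * np.sum(w ** 2)

print("MSE:", mse)          # 0.5
print("L1 loss:", l1_loss)  # 0.5 + 0.1 * 6.5  = 1.15
print("L2 loss:", l2_loss)  # 0.5 + 0.1 * 20.25 = 2.525
```

Both penalties grow with coefficient size, so minimizing the regularized loss pushes the coefficients toward smaller values.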
L1 Regularization (Lasso Regression)
Full Form:
Least Absolute Shrinkage and Selection Operator
Formula:
Loss = MSE + \lambda \sum |w|
\lambda = regularization parameter (controls penalty strength)
w = model coefficients
Key Feature:
- Shrinks coefficients
- Can make some coefficients exactly 0
- Performs feature selection
When to Use?
- When many irrelevant features exist
- When you want automatic feature selection
Python Example
Lasso Regression Model Training and Evaluation in Python
This code demonstrates how to train a Lasso Regression model using Python. It generates sample data, splits it into training and testing sets, fits the Lasso model, and prints the model coefficients along with training and testing scores to evaluate performance.
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
import numpy as np
# Sample data: 5 features, but only features 0, 2, and 4 influence y
X = np.random.rand(100, 5)
y = X @ np.array([5, 0, 3, 0, 2]) + np.random.randn(100)

# Hold out 20% of the data for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# alpha is the regularization strength (lambda in the formula above)
model = Lasso(alpha=0.1)
model.fit(X_train, y_train)
print("Coefficients:", model.coef_)
print("Training Score:", model.score(X_train, y_train))
print("Testing Score:", model.score(X_test, y_test))
Output (values will vary between runs, since the data and split are random):
Coefficients: [3.60521106 0. 1.07966069 0. 0.1975825 ]
Training Score: 0.5994195947062018
Testing Score: 0.4909136833444431
Notice that the coefficients for the irrelevant features become exactly 0: Lasso has performed feature selection automatically.
L2 Regularization (Ridge Regression)
Formula:
Loss = MSE + \lambda \sum w^2
Key Feature:
- Shrinks coefficients
- Does NOT make them zero
- Reduces the impact of less important features
When to Use?
- When all features are important
- When multicollinearity exists
Python Example
Ridge Regression Model Training and Evaluation in Python
This code shows how to train a Ridge Regression model using Python. The model is fitted on training data and then used to calculate coefficients and evaluate performance using training and testing scores. Ridge Regression helps reduce overfitting by applying L2 regularization to the model.
from sklearn.linear_model import Ridge

# Reuses the train/test split from the Lasso example above
model = Ridge(alpha=0.1)
model.fit(X_train, y_train)
print("Coefficients:", model.coef_)
print("Training Score:", model.score(X_train, y_train))
print("Testing Score:", model.score(X_test, y_test))
Output (values will vary between runs):
Coefficients: [4.61903088 0.26909111 2.27121045 0.51240214 1.45879947]
Training Score: 0.708650666355346
Testing Score: 0.5313172051848971
Bias-Variance Tradeoff
Understanding Ridge and Lasso requires understanding Bias & Variance.
Bias
Error due to an overly simple model.
- Causes underfitting
- High bias → model too simple
Example: using a straight line to fit complex curved data.
Variance
Error due to an overly complex model.
- Causes overfitting
- High variance → model too complex
Tradeoff
Goal: find a balance between bias and variance.
Regularization:
- Slightly increases bias
- Reduces variance
- Improves generalization
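The tradeoff can be observed directly by sweeping the regularization strength. A hedged sketch (the alpha grid, seed, and synthetic data are our own choices): with a tiny alpha the train score sits well above the test score (high variance), while a huge alpha pulls both scores down (high bias).

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((100, 5))
y = X @ np.array([5, 0, 3, 0, 2]) + rng.normal(size=100)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Sweep alpha from almost no regularization to very heavy regularization
for alpha in [0.001, 0.1, 10, 1000]:
    model = Ridge(alpha=alpha).fit(X_train, y_train)
    print(f"alpha={alpha:>8}: "
          f"train={model.score(X_train, y_train):.3f}, "
          f"test={model.score(X_test, y_test):.3f}")
```

An intermediate alpha typically gives the best test score, which is exactly the balance point the tradeoff describes.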