Polynomial Regression

  • This module explains Polynomial Regression: how it models non-linear relationships using polynomial features, and how model complexity affects the fit.
  • Polynomial Regression

    Polynomial Regression is an extension of Linear Regression used when the relationship between variables is non-linear.

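    Even though it models curves, it is still considered a linear model because the prediction is a linear combination of the coefficients. The non-linearity lives only in the transformed features (X^2, X^3, and so on).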


    Non-Linear Relationships

    What is a Non-Linear Relationship?

    A relationship is non-linear when plotting Y against X does not give a straight line.

    Example

    Suppose:

    • As study hours increase, marks increase rapidly at first

    • After some time, the increase slows down

    This creates a curve, not a straight line.

    Linear vs Polynomial

    Linear Equation:

    Y = mX + c

    Polynomial Equation (Degree 2):

    Y = aX^2 + bX + c

    Higher Degree Example:

    Y = aX^3 + bX^2 + cX + d
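    To make the difference concrete, the sketch below fits a degree-1 and a degree-2 polynomial to the small study-hours dataset used in the example later in this module. It uses NumPy's polyfit instead of scikit-learn purely for brevity, and the helper name fit_and_score is illustrative, not part of the lesson.

```python
import numpy as np

# Dataset from the study-hours example later in this module
hours = np.array([1, 2, 3, 4, 5], dtype=float)
marks = np.array([2, 6, 12, 20, 30], dtype=float)

def fit_and_score(degree):
    """Fit a polynomial of the given degree and return (coefficients, MSE)."""
    coeffs = np.polyfit(hours, marks, degree)   # highest power first
    predictions = np.polyval(coeffs, hours)
    mse = float(np.mean((marks - predictions) ** 2))
    return coeffs, mse

linear_coeffs, linear_mse = fit_and_score(1)
quadratic_coeffs, quadratic_mse = fit_and_score(2)

print("Degree 1:", np.round(linear_coeffs, 3), "MSE:", round(linear_mse, 3))
print("Degree 2:", np.round(quadratic_coeffs, 3), "MSE:", round(quadratic_mse, 3))
```

    On this data the best straight line is Y = 7X - 7, which leaves a mean squared error of 2.8, while the degree-2 fit reproduces the points up to floating-point error.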


    Polynomial Features

    In Polynomial Regression, we transform original features into polynomial features.

    Example:

    If:

    X = 2

    Polynomial Features (degree = 3):

    [1, X, X^2, X^3] = [1, 2, 4, 8]

    So instead of just X, the model learns using:

    • X

    • X^2

    • X^3

    This allows it to fit curved data.
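    The feature expansion can be sketched with plain NumPy. The helper polynomial_features below is a hypothetical stand-in for what PolynomialFeatures does on a single input column, not scikit-learn's actual implementation. The second half illustrates why the model remains linear in its coefficients: once the columns are built, fitting is an ordinary least-squares problem.

```python
import numpy as np

def polynomial_features(x, degree):
    """Return columns [1, x, x^2, ..., x^degree] for a 1-D array x."""
    x = np.asarray(x, dtype=float)
    return np.vander(x, N=degree + 1, increasing=True)

# The worked example from the text: X = 2, degree = 3
print(polynomial_features([2], degree=3))   # [[1. 2. 4. 8.]]

# Fitting is ordinary least squares on the expanded columns,
# which is why the model stays linear in its coefficients.
hours = np.array([1, 2, 3, 4, 5])
marks = np.array([2, 6, 12, 20, 30])
design = polynomial_features(hours, degree=2)
coeffs, *_ = np.linalg.lstsq(design, marks, rcond=None)
print(np.round(coeffs, 6))   # coefficients for [1, X, X^2]
```

    For the study-hours data used later in this module, the solved coefficients are approximately 0, 1 and 1, that is Y = X^2 + X.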


    Model Complexity

    Low Degree (Underfitting)

    Degree = 1 → Straight line
    May not capture complex patterns.

    High Degree (Overfitting)

    Degree = 10 → Very complex curve
    May fit noise instead of the underlying pattern.

    Goal

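    Choose a degree that balances:

    • Bias (error from a model too simple to capture the pattern)

    • Variance (error from a model so flexible that it tracks noise in the training data)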
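    One common way to see this trade-off is to compare the error on the training data with the error on held-out data for several degrees. The sketch below does this on synthetic data; the generating curve, noise level, seed and the alternating train/validation split are all assumptions chosen for illustration and do not come from the lesson.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed; synthetic data, not from the lesson

# Hypothetical curved relationship plus noise, with inputs scaled to [-1, 1]
x = np.linspace(-1, 1, 40)
y = 1.0 + 2.0 * x - 1.5 * x**2 + rng.normal(scale=0.3, size=x.size)

# Alternate points between a training set and a held-out validation set
x_train, y_train = x[::2], y[::2]
x_val, y_val = x[1::2], y[1::2]

def mse(coeffs, xs, ys):
    """Mean squared error of the polynomial given by coeffs on (xs, ys)."""
    return float(np.mean((np.polyval(coeffs, xs) - ys) ** 2))

train_mse = {}
val_mse = {}
for degree in (1, 2, 10):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse[degree] = mse(coeffs, x_train, y_train)
    val_mse[degree] = mse(coeffs, x_val, y_val)
    print(f"degree={degree:2d}  train MSE={train_mse[degree]:.4f}  "
          f"validation MSE={val_mse[degree]:.4f}")
```

    Training error can only go down as the degree increases, because each higher-degree model contains the lower-degree ones. Validation error typically stops improving, or gets worse, once the extra flexibility is spent on fitting noise, so a reasonable degree is one where the validation error is low.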

    Example: Study Hours vs Marks (Non-linear)

Polynomial Regression (Degree 2) Using Python

This code demonstrates how to perform Polynomial Regression using Python. It expands the single input feature into polynomial features (degree 2), trains a Linear Regression model on them, makes predictions, and plots the non-linear relationship between study hours and marks.

# Step 1: Import Libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Step 2: Create Dataset
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([2, 6, 12, 20, 30])  # Non-linear pattern

# Step 3: Convert to Polynomial Features (degree = 2)
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)

# Step 4: Train Model
model = LinearRegression()
model.fit(X_poly, y)

# Step 5: Predictions
y_pred = model.predict(X_poly)

# Step 6: Plot Graph
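X_grid = np.linspace(1, 5, 100).reshape(-1, 1)   # Dense grid so the curve is drawn smoothly
y_grid = model.predict(poly.transform(X_grid))
plt.scatter(X, y)            # Actual data
plt.plot(X_grid, y_grid)     # Polynomial curve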
plt.xlabel("Study Hours")
plt.ylabel("Marks")
plt.title("Polynomial Regression (Degree 2)")
plt.show()

# Step 7: Print Coefficients
print("Intercept:", model.intercept_)
print("Coefficients:", model.coef_)
  • Output:

    Intercept: 3.552713678800501e-15

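    Coefficients: [0. 1. 1.]

    The coefficients correspond to the features [1, X, X^2]. The first one is 0 because LinearRegression fits the intercept separately from the constant column, and the intercept itself is zero up to floating-point error. The fitted curve is therefore Y = X^2 + X, which matches the data exactly.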
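    As a possible next step, the feature expansion and the regression can be chained into a single estimator so that new inputs are transformed automatically. The sketch below uses scikit-learn's make_pipeline on the same data; the input of 6 study hours is a hypothetical value that is not part of the original dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Same dataset as in the lesson code
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([2, 6, 12, 20, 30])

# Chain the feature expansion and the regression into one estimator
pipeline = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
pipeline.fit(X, y)

# Hypothetical new input: 6 study hours (not part of the original data)
new_hours = np.array([[6]])
predicted_marks = pipeline.predict(new_hours)
print("Predicted marks for 6 hours:", round(float(predicted_marks[0]), 2))
```

    Since the fitted curve is Y = X^2 + X, the prediction should be close to 42. Note that 6 lies outside the range of the training data, and predictions made by extrapolating a polynomial are generally less reliable than those made inside the observed range.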
