Logistic Regression

  • Logistic Regression is a popular classification algorithm used to predict the probability of binary outcomes. It uses the sigmoid function to map predicted values into probabilities.
  • Sigmoid Function

    Logistic Regression uses the Sigmoid (Logistic) Function to convert output into probability.

    Sigmoid Formula

    \sigma(z) = \frac{1}{1 + e^{-z}}

    Where:

    z = wx + b

    What Sigmoid Does

    • Converts any real number into a value between 0 and 1

    • Output becomes probability

    Example

    If:

    z = 2

    \sigma(2) \approx 0.88

    Meaning → 88% probability of class 1

    Graph Behavior

    • S-shaped curve

    • Large negative input → output near 0

    • Large positive input → output near 1
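
    The curve behavior described above is easy to check numerically. A minimal sketch of the sigmoid in plain Python (using only the standard library):

```python
import math

def sigmoid(z):
    """Map any real number z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(2))   # ≈ 0.88, matching the example above
print(sigmoid(-5))  # large negative input -> output near 0
print(sigmoid(5))   # large positive input -> output near 1
```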


    Binary Classification

    Binary Classification means predicting between two classes.

    Examples:

    • Spam / Not Spam

    • Pass / Fail

    • Fraud / Not Fraud

    • Disease / No Disease

    Example

    Suppose we predict whether a student will pass:

    Study Hours   Result
    2             Fail (0)
    5             Pass (1)

    Logistic Regression predicts:

    P(Y = 1 \mid X)

    If probability ≥ 0.5 → Class = 1
    If probability < 0.5 → Class = 0
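
    The 0.5 thresholding rule above can be written as a one-line helper:

```python
def classify(p, threshold=0.5):
    """Turn a predicted probability into a class label: 1 if p >= threshold, else 0."""
    return 1 if p >= threshold else 0

print(classify(0.88))  # probability >= 0.5 -> class 1
print(classify(0.47))  # probability < 0.5  -> class 0
```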


    Odds & Logit

    This is the mathematical foundation of Logistic Regression.

    Odds

    Odds measure how likely an event is to happen relative to it not happening.

    \text{Odds} = \frac{P}{1 - P}

    Example:

    If probability = 0.8

    \text{Odds} = \frac{0.8}{0.2} = 4

    Meaning → The event is 4 times as likely to happen as not to happen.

    Logit Function

    Logit is the log of odds.

    \text{Logit} = \log\left(\frac{P}{1 - P}\right)

    Logistic Regression actually models:

    \log\left(\frac{P}{1 - P}\right) = wx + b

    So:

    • Linear relationship is between X and log-odds

    • Not directly between X and probability
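
    The odds and logit definitions above can be verified directly, including the fact that the sigmoid is the inverse of the logit:

```python
import math

def odds(p):
    """Odds of an event with probability p."""
    return p / (1 - p)

def logit(p):
    """Log-odds of p."""
    return math.log(p / (1 - p))

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

print(odds(0.8))            # 4, matching the example above
print(logit(0.8))           # log-odds of 0.8
print(sigmoid(logit(0.8)))  # recovers 0.8 -- sigmoid inverts the logit
```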


    Decision Boundary

    The decision boundary separates the two classes.

    For binary classification:

    P = 0.5

    At probability 0.5:

    wx + b = 0

    This line (or hyperplane) is called the decision boundary.


    Example (2D Case)

    If:

    2x_1 + 3x_2 - 6 = 0

    That equation forms the decision boundary.

    Points on one side → Class 1
    Points on other side → Class 0
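
    For the 2D boundary above, a point's class is determined by the sign of the linear expression; a minimal sketch:

```python
def side(x1, x2):
    """Evaluate the example boundary 2*x1 + 3*x2 - 6 and return the class.

    z >= 0 corresponds to P >= 0.5, i.e. class 1."""
    z = 2 * x1 + 3 * x2 - 6
    return 1 if z >= 0 else 0

print(side(3, 2))  # 2*3 + 3*2 - 6 = 6 >= 0 -> class 1
print(side(0, 0))  # -6 < 0 -> class 0
```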


    Example: Pass/Fail Prediction

This Python example demonstrates how to use Logistic Regression with scikit-learn to predict whether a student will pass or fail based on study hours. The code creates a small dataset, trains a Logistic Regression model, calculates the probability of passing, and predicts the final class label.

# Step 1: Import Libraries
import numpy as np
from sklearn.linear_model import LogisticRegression

# Step 2: Create Dataset
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([0, 0, 0, 1, 1])  # 0 = Fail, 1 = Pass

# Step 3: Create Model
model = LogisticRegression()

# Step 4: Train Model
model.fit(X, y)

# Step 5: Predict Probability
prob = model.predict_proba([[3.5]])
print("Probability of Passing:", prob[0][1])

# Step 6: Predict Class
prediction = model.predict([[3.5]])
print("Predicted Class:", prediction[0])
  • Output:

    Probability of Passing: 0.47913110199975184

    Predicted Class: 0

    Important Parameters

    Parameter   Meaning
    penalty     Regularization type
    C           Inverse of regularization strength (smaller C = stronger regularization)
    solver      Optimization method
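
    A short sketch of how these parameters are passed to scikit-learn (the values shown are scikit-learn's defaults, used here only for illustration):

```python
from sklearn.linear_model import LogisticRegression

# penalty: regularization type; C: inverse regularization strength
# (smaller C = stronger regularization); solver: optimization method.
model = LogisticRegression(penalty="l2", C=1.0, solver="lbfgs")

print(model.get_params()["penalty"])
print(model.get_params()["C"])
print(model.get_params()["solver"])
```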


    Logistic Regression vs Linear Regression

    Linear Regression              Logistic Regression
    Predicts a continuous value    Predicts a probability
    Output range (-∞, +∞)          Output range (0, 1)
    Uses MSE                       Uses Log Loss
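
    The Log Loss mentioned above (binary cross-entropy) can be computed directly for a single prediction; a minimal sketch:

```python
import math

def log_loss_single(y_true, p):
    """Binary cross-entropy for one example: y_true in {0, 1}, p = predicted P(Y=1)."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# A confident correct prediction gives low loss; a confident wrong one, high loss.
print(log_loss_single(1, 0.9))  # low loss
print(log_loss_single(1, 0.1))  # high loss
```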
