Logistic Regression
- Logistic Regression is a popular classification algorithm used to predict the probability of binary outcomes. It uses the sigmoid function to map predicted values into probabilities.
Sigmoid Function
Logistic Regression uses the Sigmoid (Logistic) function to convert the model's linear output into a probability.
Sigmoid Formula
σ(z) = 1 / (1 + e^(-z))
Where:
z = wx + b
What Sigmoid Does
Converts any real number into a value between 0 and 1
Output becomes probability
Example
If:
z = 2
σ(2) ≈ 0.88
Meaning → 88% probability of class 1
Graph Behavior
S-shaped curve
Small input → output near 0
Large input → output near 1
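The behavior above can be checked directly; a minimal sketch of the sigmoid in NumPy:

```python
import numpy as np

def sigmoid(z):
    # Maps any real number into a value between 0 and 1
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(2))    # ≈ 0.88, matching the example above
print(sigmoid(-5))   # small input → output near 0
print(sigmoid(5))    # large input → output near 1
```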
Binary Classification
Binary Classification means predicting between two classes.
Examples:
Spam / Not Spam
Pass / Fail
Fraud / Not Fraud
Disease / No Disease
Example
Suppose we predict whether a student will pass:
Logistic Regression predicts:
P(Y = 1 | X)
If probability ≥ 0.5 → Class = 1
If probability < 0.5 → Class = 0
Odds & Logit
This is the mathematical foundation of Logistic Regression.
Odds
Odds measure how likely an event is to happen relative to it not happening.
Odds = P / (1 − P)
Example:
If probability = 0.8
Odds = 0.8 / 0.2 = 4
Meaning → The event is 4 times as likely to happen as not to happen.
Logit Function
Logit is the log of odds.
Logit = log(P / (1 − P))
Logistic Regression actually models:
log(P / (1 − P)) = wx + b
So:
The linear relationship is between X and the log-odds
Not directly between X and the probability
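The odds and logit calculations above can be sketched as follows; the last step also shows that applying the sigmoid to the logit recovers the original probability:

```python
import math

p = 0.8                  # probability of the event
odds = p / (1 - p)       # 0.8 / 0.2 = 4
logit = math.log(odds)   # log-odds: the quantity the model is linear in

print("Odds:", odds)     # 4.0
print("Logit:", logit)   # ≈ 1.386

# Inverting the logit with the sigmoid recovers the probability
recovered = 1 / (1 + math.exp(-logit))
print("Recovered p:", recovered)  # ≈ 0.8
```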
Decision Boundary
The decision boundary separates the two classes.
For binary classification:
P = 0.5
At probability 0.5:
wx + b = 0
This line (or hyperplane) is called the decision boundary.
Example (2D Case)
If:
2x₁ + 3x₂ − 6 = 0
That equation forms the decision boundary.
Points on one side → Class 1
Points on the other side → Class 0
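As a sketch, the 2D boundary above can be used to classify points by the sign of wx + b, with the weights w = (2, 3) and intercept b = −6 taken from the example equation:

```python
import numpy as np

w = np.array([2.0, 3.0])   # weights from the example boundary
b = -6.0                   # intercept

def classify(x):
    # Points with wx + b > 0 fall on the Class 1 side
    return 1 if np.dot(w, x) + b > 0 else 0

print(classify([3, 2]))    # 2*3 + 3*2 - 6 = 6  > 0 → Class 1
print(classify([1, 1]))    # 2*1 + 3*1 - 6 = -1 < 0 → Class 0
```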
Logistic Regression Example in Python for Pass/Fail Prediction
This Python example uses Logistic Regression from scikit-learn to predict whether a student will pass or fail based on study hours. The code builds a small dataset, trains the model, computes the probability of passing for a new input, and predicts the final class.
# Step 1: Import Libraries
import numpy as np
from sklearn.linear_model import LogisticRegression
# Step 2: Create Dataset
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([0, 0, 0, 1, 1]) # 0 = Fail, 1 = Pass
# Step 3: Create Model
model = LogisticRegression()
# Step 4: Train Model
model.fit(X, y)
# Step 5: Predict Probability
prob = model.predict_proba([[3.5]])
print("Probability of Passing:", prob[0][1])
# Step 6: Predict Class
prediction = model.predict([[3.5]])
print("Predicted Class:", prediction[0])
Output:
Probability of Passing: 0.47913110199975184
Predicted Class: 0
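The probability above can be reproduced by hand from the fitted coefficients, since predict_proba applies the sigmoid to w·x + b:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([0, 0, 0, 1, 1])  # 0 = Fail, 1 = Pass

model = LogisticRegression()
model.fit(X, y)

w = model.coef_[0][0]        # learned weight
b = model.intercept_[0]      # learned bias
z = w * 3.5 + b
manual_prob = 1 / (1 + np.exp(-z))

print("Manual sigmoid probability:", manual_prob)
print("predict_proba result:     ", model.predict_proba([[3.5]])[0][1])
# The two values match: predict_proba is sigmoid(w*x + b)
```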
Important Parameters
Logistic Regression vs Linear Regression