Advanced Performance Metrics
- Advanced performance metrics evaluate classification models using probability-based and threshold-based techniques, going beyond simple accuracy.
Precision
Precision measures how many of the predicted positives are actually positive.
Precision = TP / (TP + FP)
Focuses on correctness of positive predictions
High Precision → Few false positives
Example: Spam detection
Of 100 predicted spam emails, only 80 are truly spam → Precision = 80%
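The spam example above can be verified with a short sketch; the counts (80 true spam out of 100 predicted) come directly from the example:

```python
# Spam example: 100 emails predicted as spam.
tp = 80  # predicted spam, actually spam (true positives)
fp = 20  # predicted spam, actually not spam (false positives)

precision = tp / (tp + fp)
print("Precision:", precision)  # 0.8
```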
Recall (Sensitivity / True Positive Rate)
Recall measures how many of the actual positives are correctly predicted.
Recall = TP / (TP + FN)
Focuses on capturing all actual positives
High Recall → Few false negatives
Example: Spam detection
Out of 100 actual spam emails, model detects 90 → Recall = 90%
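The same check works for recall; the counts (90 detected out of 100 actual spam) are taken from the example:

```python
# Spam example: 100 actual spam emails.
tp = 90  # actual spam, correctly detected (true positives)
fn = 10  # actual spam, missed by the model (false negatives)

recall = tp / (tp + fn)
print("Recall:", recall)  # 0.9
```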
F1 Score
F1 Score is the harmonic mean of Precision and Recall, balancing both.
F1 = 2 × (Precision × Recall) / (Precision + Recall)
High F1 → Good balance between Precision and Recall
Useful when both false positives and false negatives matter
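Plugging the precision and recall values from the two spam examples above (0.8 and 0.9) into the harmonic-mean formula:

```python
precision = 0.8
recall = 0.9

# Harmonic mean penalizes imbalance between the two metrics
f1 = 2 * precision * recall / (precision + recall)
print("F1 Score:", round(f1, 4))  # 0.8471
```

Note that the harmonic mean (≈0.847) is lower than the arithmetic mean (0.85); it drags the score toward the weaker of the two metrics.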
Precision-Recall Tradeoff
Increasing Precision → Often reduces Recall
Increasing Recall → Often reduces Precision
Threshold tuning helps find the right balance for the problem's requirements
Example:
Spam email classifier:
High Precision → Fewer non-spam marked as spam (reduce false alarms)
High Recall → Catch most spam emails (even if some non-spam flagged)
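The tradeoff can be seen by sweeping the decision threshold over predicted probabilities. A minimal sketch, using made-up scores and labels for illustration: as the threshold rises, precision tends to increase while recall falls.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical true labels and predicted probabilities (illustrative only)
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_scores = np.array([0.9, 0.4, 0.65, 0.8, 0.3, 0.55, 0.45, 0.2])

for threshold in (0.3, 0.5, 0.7):
    # Classify as positive when the score meets the threshold
    y_pred = (y_scores >= threshold).astype(int)
    p = precision_score(y_true, y_pred, zero_division=0)
    r = recall_score(y_true, y_pred, zero_division=0)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

With these scores, a low threshold (0.3) catches every spam email but with more false alarms, while a high threshold (0.7) makes only confident predictions, trading recall for precision.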
Python Example
Precision, Recall, and F1 Score Calculation in Python for Model Evaluation
This Python example demonstrates how to evaluate a classification model using Precision, Recall, and F1 Score. The code compares actual labels with predicted labels, calculates each metric using scikit-learn, and prints the results. These metrics measure the quality of predictions, especially in classification problems where class imbalance may occur.
from sklearn.metrics import precision_score, recall_score, f1_score
import numpy as np
# Step 1: Actual vs Predicted labels
# y_true contains 4 positives; y_pred finds 3 of them (TP=3, FN=1, FP=0)
y_true = np.array([1, 1, 0, 0, 1, 0, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 0, 1, 0, 0, 1, 0, 0])
# Step 2: Calculate Precision
precision = precision_score(y_true, y_pred)
print("Precision:", precision)
# Step 3: Calculate Recall
recall = recall_score(y_true, y_pred)
print("Recall:", recall)
# Step 4: Calculate F1 Score
f1 = f1_score(y_true, y_pred)
print("F1 Score:", f1)
Output:
Precision: 1.0
Recall: 0.75
F1 Score: 0.8571428571428571