
ML - Precision vs Recall

Overview

Precision and Recall are two critical metrics that often work in opposition. Understanding when to prioritize each is essential for building effective classification models.

Side-by-Side Comparison

| Aspect | Precision | Recall |
| --- | --- | --- |
| Definition | Of all predicted positive cases, how many are actually positive? | Of all actual positive cases, how many did we catch? |
| Formula | $\text{Precision} = \frac{TP}{TP + FP}$ | $\text{Recall} = \frac{TP}{TP + FN}$ |
| Focus | Correctness of positive predictions (false alarm rate) | Completeness of positive identification (miss rate) |
| Uses FP or FN? | Penalizes False Positives (Type I errors) | Penalizes False Negatives (Type II errors) |
| Range | 0 to 1 (or 0% to 100%) | 0 to 1 (or 0% to 100%) |
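The two formulas above can be sketched in a few lines of Python. This is a minimal illustration; the function names are ours, not from any library, and the counts used in the printout are the spam-filter numbers worked through later in this article.

```python
# Minimal sketch: precision and recall from confusion-matrix counts.

def precision(tp: int, fp: int) -> float:
    """Of all predicted positives, the fraction that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp: int, fn: int) -> float:
    """Of all actual positives, the fraction the model caught."""
    return tp / (tp + fn) if (tp + fn) else 0.0

# Spam-filter counts: TP=80, FP=15, FN=20
print(round(precision(80, 15), 3))  # 0.842
print(round(recall(80, 20), 3))     # 0.8
```

Returning 0.0 when the denominator is zero is one common convention for the degenerate case (no positive predictions, or no actual positives); libraries differ on how they handle it.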

Real-World Examples

Example 1: Spam Email Filter

Scenario: Your email provider has 1,000 emails to classify. The confusion matrix:

| | Predicted Spam | Predicted Not Spam |
| --- | --- | --- |
| Actually Spam | 80 (TP) | 20 (FN) |
| Actually Not Spam | 15 (FP) | 885 (TN) |

Calculations:

$$\text{Precision} = \frac{80}{80 + 15} = \frac{80}{95} \approx 0.842 \text{ or } 84.2\%$$

$$\text{Recall} = \frac{80}{80 + 20} = \frac{80}{100} = 0.80 \text{ or } 80\%$$
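As a sanity check, the same numbers can be recovered by replaying the confusion matrix as label lists and counting outcomes directly (pure Python, no ML library assumed):

```python
# Rebuild the 1,000-email example: 100 actual spam, 900 legitimate.
y_true = [1] * 100 + [0] * 900
# Predictions aligned with y_true: 80 TP, 20 FN, then 15 FP, 885 TN.
y_pred = [1] * 80 + [0] * 20 + [1] * 15 + [0] * 885

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

print(f"Precision: {tp / (tp + fp):.3f}")  # 0.842
print(f"Recall:    {tp / (tp + fn):.3f}")  # 0.800
```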

Interpretation:

- Precision (84.2%): of the 95 emails flagged as spam, 80 really were spam; 15 legitimate emails were wrongly filtered
- Recall (80%): of the 100 actual spam emails, the filter caught 80 and let 20 through to the inbox

Trade-off Decision: In a spam filter, users prefer to see an occasional spam email (lower recall) rather than lose important legitimate emails to the spam folder (lower precision). Prioritize Precision.


Example 2: Medical Disease Detection (Cancer Screening)

Scenario: Testing 1,000 patients for cancer. The confusion matrix:

| | Predicted Positive | Predicted Negative |
| --- | --- | --- |
| Actually Positive | 90 (TP) | 10 (FN) |
| Actually Negative | 5 (FP) | 895 (TN) |

Calculations:

$$\text{Precision} = \frac{90}{90 + 5} = \frac{90}{95} \approx 0.947 \text{ or } 94.7\%$$

$$\text{Recall} = \frac{90}{90 + 10} = \frac{90}{100} = 0.90 \text{ or } 90\%$$
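The same arithmetic applies to the screening matrix; here it is wrapped in a small reusable helper (a sketch with illustrative names, not a library function):

```python
# Reusable helper: both metrics from confusion-matrix counts.

def prf_from_counts(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from confusion-matrix counts."""
    return tp / (tp + fp), tp / (tp + fn)

# Cancer-screening counts: TP=90, FP=5, FN=10
p, r = prf_from_counts(tp=90, fp=5, fn=10)
print(f"Precision: {p:.1%}")  # 94.7%
print(f"Recall:    {r:.1%}")  # 90.0%
```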

Interpretation:

- Precision (94.7%): of the 95 patients who tested positive, 90 actually have the disease; 5 face unnecessary follow-up testing
- Recall (90%): of the 100 patients with cancer, the test detected 90 but missed 10

Trade-off Decision: In medical screening, missing a patient with cancer (low recall) is far worse than a false alarm (low precision). Prioritize Recall.


Precision-Recall Trade-off

As you adjust your model’s classification threshold, precision and recall move in opposite directions:

Higher Threshold (stricter about predictions)
         ↓
    Fewer positive predictions
         ↓
    Fewer False Positives → Higher Precision ✓
         ↓
    More False Negatives → Lower Recall ✗

Lower Threshold (more lenient about predictions)
         ↓
    More positive predictions
         ↓
    More False Positives → Lower Precision ✗
         ↓
    Fewer False Negatives → Higher Recall ✓
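The diagram above can be seen numerically with a toy example: scores and labels below are made up for illustration, and raising the threshold lifts precision while dropping recall.

```python
# Made-up model scores and ground-truth labels, for illustration only.
scores = [0.95, 0.90, 0.85, 0.60, 0.55, 0.40, 0.30, 0.20]
labels = [1,    1,    0,    1,    0,    0,    1,    0]

def metrics_at(threshold: float) -> tuple[float, float]:
    """Precision and recall when predicting positive at score >= threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p and t for p, t in zip(preds, labels))
    fp = sum(p and not t for p, t in zip(preds, labels))
    fn = sum((not p) and t for p, t in zip(preds, labels))
    return tp / (tp + fp), tp / (tp + fn)

for th in (0.3, 0.7):
    p, r = metrics_at(th)
    print(f"threshold={th}: precision={p:.2f}, recall={r:.2f}")
# threshold=0.3: precision=0.57, recall=1.00
# threshold=0.7: precision=0.67, recall=0.50
```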

Which Metric to Choose?

Choose Precision When:

- False positives are costly: acting on a wrong positive wastes money, time, or user trust (e.g., spam filtering, loan approval)

Choose Recall When:

- False negatives are costly: missing a true positive is dangerous or expensive (e.g., cancer screening, intrusion detection)

Choose F1 Score When:

- You need a single metric that balances both and neither error type clearly dominates; F1 is the harmonic mean of precision and recall

Using the Spam Email Example

Recall our spam filter with:

- Precision ≈ 84.2% (80 of the 95 flagged emails were actually spam)
- Recall = 80% (80 of the 100 spam emails were caught)

Analysis:

$$\text{F1} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} = 2 \times \frac{0.842 \times 0.80}{0.842 + 0.80} \approx 0.821 \text{ or } 82.1\%$$

This is a good balance for a spam filter because:

- Precision of 84.2% means only 15 of the 900 legitimate emails are wrongly flagged
- Recall of 80% means most spam is blocked, and the 20 missed messages are a minor annoyance
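The F1 arithmetic for the spam filter can be checked in a couple of lines (counts taken from the confusion matrix above):

```python
# F1 as the harmonic mean of the spam filter's precision and recall.
precision, recall = 80 / 95, 80 / 100  # from TP=80, FP=15, FN=20
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.3f}")  # 0.821
```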

Summary: Precision vs Recall Decision Matrix

| Scenario | Priority | Reason |
| --- | --- | --- |
| Spam filter | Precision | User frustration from missed legitimate emails > Spam in inbox |
| Cancer screening | Recall | Missing cancer diagnosis > False alarm requiring further testing |
| Loan approval | Precision | Financial loss from bad loans > Rejecting good applications |
| Intrusion detection | Recall | Missing a breach > Some false alarms on harmless traffic |
| Hiring recruiter | Precision | Interview time wasted on bad candidates > Missing good candidates |
| Defect detection | Recall | Defective products reaching customers > Some false inspections |