Explainable AI for Healthcare

Three-Class COVID-19 Detection Using a CNN with Honest Grad-CAM Analysis

I investigate explainable AI for medical imaging — building CNNs and using Grad-CAM not as decoration, but as a diagnostic instrument to uncover what models actually learn.

What This Research Shows

Key Results

  • 95.27% test accuracy on 6,788 images
  • 0.95 macro F1 across 3 classes
  • Lightweight model: 456K params (1.74 MB)
  • COVID-19 false negative rate: 1.8%
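The 1.74 MB footprint can be sanity-checked from the parameter count alone. The sketch below is a back-of-the-envelope check, not part of the published code. It assumes the weights are stored as 32-bit floats and that "MB" means binary megabytes (MiB); neither assumption is stated in the results above, and the function name is illustrative.

```python
# Rough size check for the reported model footprint.
# Assumptions (not stated in the source): float32 weights, and
# "MB" read as binary megabytes (1 MiB = 1024 * 1024 bytes).
PARAM_COUNT = 456_000      # "456K params", rounded as reported
BYTES_PER_PARAM = 4        # one float32 value


def model_size_mib(param_count: int, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Return the raw weight size in MiB, ignoring file-format overhead."""
    return param_count * bytes_per_param / (1024 * 1024)


size = model_size_mib(PARAM_COUNT)
print(f"{size:.2f} MiB")   # prints "1.74 MiB"
```

Under these assumptions the arithmetic reproduces the reported figure, which suggests the size refers to uncompressed float32 weights.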

Why It Matters

  • Three-class classification (not just binary) — more clinically realistic
  • Built on a benchmark dataset: 33,958 images with standardized splits
  • Honest interpretability: Grad-CAM used to discover limitations, not just to produce heatmaps
  • Deployable via TensorFlow.js — lightweight enough for browser inference

Evidence of Shortcut Learning

Grad-CAM revealed extra-pulmonary activation in some COVID-19 samples — attention appearing outside the lung region. This indicates the model may be using dataset artifacts rather than clinical features.
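One way to turn "attention outside the lung region" into a number is to measure how much of a heatmap's total activation falls outside a lung mask. The sketch below is a minimal illustration under stated assumptions: it assumes a non-negative Grad-CAM heatmap and a binary lung mask of the same shape are already available, and the function name `extra_pulmonary_fraction` is hypothetical rather than taken from the project code.

```python
from typing import List


def extra_pulmonary_fraction(heatmap: List[List[float]],
                             lung_mask: List[List[int]]) -> float:
    """Share of total heatmap activation that falls outside the lung mask.

    heatmap:   non-negative activation values, H x W
    lung_mask: 1 inside the lungs, 0 elsewhere, same shape as heatmap
    """
    total = 0.0
    outside = 0.0
    for heat_row, mask_row in zip(heatmap, lung_mask):
        for heat, inside in zip(heat_row, mask_row):
            total += heat
            if not inside:
                outside += heat
    # An all-zero heatmap has no activation to attribute anywhere.
    return outside / total if total > 0 else 0.0


# Toy 3x3 example: the middle column is "lung", everything else is not.
heatmap = [[0.0, 0.2, 0.0],
           [0.1, 0.5, 0.0],
           [0.0, 0.0, 0.2]]
lung_mask = [[0, 1, 0],
             [0, 1, 0],
             [0, 0, 0]]

fraction = extra_pulmonary_fraction(heatmap, lung_mask)
print(f"{fraction:.2f}")   # prints "0.30"
```

A high fraction on a sample flags it for manual inspection. It does not prove shortcut learning on its own, since Grad-CAM maps are coarse and masks can be imperfect.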

This pattern is consistent with the shortcut learning documented by DeGrave, Janizek & Lee (2021) in Nature Machine Intelligence. Rather than hiding this limitation, I treat it as a core contribution: honest interpretability as a diagnostic tool.

Performance

  • Test accuracy: 95.27%
  • Macro F1: 0.95
  • Model parameters: 456K
  • COVID-19 false negative rate: 1.8%

Per-Class Performance

  • COVID-19: F1 0.99 · Precision 0.99 · Recall 0.98
  • Non-COVID: F1 0.94 · Precision 0.92 · Recall 0.95
  • Normal: F1 0.93 · Precision 0.94 · Recall 0.92
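The headline metrics follow directly from the per-class numbers. As a quick check, the sketch below recomputes macro F1 as the unweighted mean of the three per-class F1 scores, and the false negative rate as one minus recall. It uses only the rounded values reported here, so small differences from the unrounded figures are expected; the variable and function names are illustrative.

```python
# Per-class F1 scores as reported (rounded to two decimals).
per_class_f1 = {"COVID-19": 0.99, "Non-COVID": 0.94, "Normal": 0.93}

# Macro F1: unweighted mean over classes, so each class counts equally
# regardless of how many test images it has.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)


def false_negative_rate(recall: float) -> float:
    """FNR is the complement of recall: missed positives / all positives."""
    return 1.0 - recall


covid_fnr = false_negative_rate(0.98)   # rounded COVID-19 recall

print(f"macro F1 = {macro_f1:.2f}")     # prints "macro F1 = 0.95"
print(f"COVID-19 FNR = {covid_fnr:.0%}")  # prints "COVID-19 FNR = 2%"
```

The rounded recall of 0.98 gives an FNR of about 2%, which agrees with the reported 1.8% once rounding is taken into account (1.8% corresponds to a recall of 0.982).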

Why This Paper Matters

01. Three-Class, Not Binary

Many COVID-19 chest X-ray studies perform only binary classification. This paper distinguishes COVID-19, Non-COVID pneumonia, and Normal, which is a more clinically realistic scenario.

02. Benchmark-Scale Dataset

Built on COVID-QU-Ex with a standardized train/val/test split: 33,958 chest X-ray images. Train: 21,753 · Val: 5,417 · Test: 6,788.
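The split sizes can be checked against the stated total, and the implied proportions are easy to derive. The short sketch below does only that arithmetic on the numbers quoted above; it does not load the dataset, and the variable names are illustrative.

```python
# Split sizes as reported for the benchmark.
splits = {"train": 21_753, "val": 5_417, "test": 6_788}

total = sum(splits.values())
shares = {name: count / total for name, count in splits.items()}

print(total)   # prints 33958
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The three splits sum to the stated 33,958 images and work out to roughly 64% train, 16% validation, and 20% test.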

03. Honest Interpretability

Grad-CAM is used as a diagnostic tool, not decoration. This paper treats explainability as a way to discover model limitations, specifically evidence of shortcut learning in the form of extra-pulmonary activation.
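For readers unfamiliar with how a Grad-CAM heatmap is formed, the core computation is small: each feature-map channel is weighted by the spatial average of the class-score gradient for that channel, the weighted maps are summed, and a ReLU keeps only positive evidence. The sketch below shows that step in plain Python on toy values. It is an illustration of the general technique, not the repository's implementation; in practice the activations and gradients would come from a deep learning framework's automatic differentiation, and the function name `grad_cam` is an assumption.

```python
from typing import List

FeatureMap = List[List[float]]   # one channel, H x W


def grad_cam(activations: List[FeatureMap],
             gradients: List[FeatureMap]) -> FeatureMap:
    """Grad-CAM heatmap from one conv layer.

    activations: feature maps of the chosen layer, one H x W map per channel
    gradients:   gradient of the target class score w.r.t. those maps
    """
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weight = gradient averaged over all spatial positions.
    weights = [sum(sum(row) for row in grad) / (height * width)
               for grad in gradients]
    cam = [[0.0] * width for _ in range(height)]
    for weight, fmap in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    # ReLU: keep only regions that push the score toward the target class.
    return [[max(0.0, value) for value in row] for row in cam]


# Toy input: two channels on a 2x2 grid.
activations = [[[1.0, 0.0], [0.0, 2.0]],
               [[0.0, 3.0], [1.0, 0.0]]]
gradients = [[[0.5, 0.5], [0.5, 0.5]],        # channel weight  0.5
             [[-1.0, -1.0], [-1.0, -1.0]]]    # channel weight -1.0

cam = grad_cam(activations, gradients)
print(cam)   # prints [[0.5, 0.0], [0.0, 1.0]]
```

The resulting map has the resolution of the chosen layer, so it is normally upsampled to the input image size before being overlaid on the X-ray. That coarseness is one reason to treat activation near the lung boundary with caution.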

covid-cxr-gradcam

Full implementation: model, training scripts, Grad-CAM analysis, figures, and classification report.
