Intro

Understanding the Diagnostic Challenge

Due to the COVID-19 outbreak, chest X-rays have become vital for fast diagnosis. This project builds a CNN model that classifies chest X-ray images into three classes: COVID-19, Normal, and Viral Pneumonia. The goal is to support early detection and differentiation of respiratory diseases.

Dataset

  • 3 Classes: COVID-19, Normal, Viral Pneumonia

  • Size: 3,616 COVID-19, 10,192 Normal, 1,345 Viral Pneumonia images
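
The counts above are imbalanced: Normal images outnumber Viral Pneumonia images by roughly 7.6 to 1. As a rough sketch of what that means for training, the snippet below computes each class's share of the data and a set of inverse-frequency weights using the common total / (n_classes * count) heuristic. The weighting scheme is an assumption for illustration; this write-up does not say that class weights were used in the actual training run.

```python
# Class counts as listed in the Dataset section.
counts = {"COVID-19": 3616, "Normal": 10192, "Viral Pneumonia": 1345}

total = sum(counts.values())
n_classes = len(counts)

# Share of the whole dataset taken by each class.
shares = {name: n / total for name, n in counts.items()}

# Inverse-frequency weights (assumed heuristic): rarer classes get larger weights.
class_weights = {name: total / (n_classes * n) for name, n in counts.items()}

for name in counts:
    print(f"{name:16s} share={shares[name]:.1%}  weight={class_weights[name]:.2f}")
```

If weights like these were used, Keras accepts a dictionary keyed by integer class index through the class_weight argument of model.fit, so the names here would first need to be mapped to the generator's class indices.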

Model Architecture (CNN)

Input (150x150x3)

Conv2D (32 filters) + MaxPooling

Conv2D (64 filters) + MaxPooling

Flatten

Dense (128) + ReLU

Dropout (0.5)

Dense (3) with Softmax
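
The layer list above maps fairly directly onto a Keras Sequential model. The following is a minimal sketch, not the exact training script: the 3x3 kernels, 2x2 pooling windows, ReLU activations on the convolution layers, and the build_model helper name are assumptions, since the list only gives filter counts. The compile settings follow the Adam / categorical crossentropy / accuracy choice described in the procedure.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_model(num_classes: int = 3) -> tf.keras.Model:
    """Small CNN following the layer list: two conv/pool blocks, then a dense head."""
    model = tf.keras.Sequential([
        layers.Input(shape=(150, 150, 3)),
        # Assumed: 3x3 kernels with ReLU and 2x2 max pooling.
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),  # randomly drops half of the dense activations during training
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_model()
model.summary()
```

Under these assumptions the Flatten layer produces 36 x 36 x 64 = 82,944 features, so the Dense (128) layer holds almost all of the roughly 10.6 million parameters. That is one reason Dropout is placed right after it.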

Improvements from Original Experiment

| Aspect | Original | My Version |
| --- | --- | --- |
| Dataset Handling | Raw zip used as-is | Cleaned & verified |
| Augmentation | May be absent | ✅ Added |
| Overfitting Prevention | Not handled | ✅ Dropout used |
| Evaluation | Accuracy only | ✅ Accuracy + Loss + Graphs |
| Visualization | Minimal | ✅ Informative plots |

Procedure

Key Procedures

| No | Step | Brief Description |
| --- | --- | --- |
| 1 | Download Dataset | Use Kaggle API (kaggle.json) to download the COVID-19 Radiography dataset |
| 2 | Extract & Check Structure | Unzip the dataset and review folder vs metadata file structure |
| 3 | Remove Unused Files | Ignore .xlsx metadata files, keep only image folders (COVID, Normal, etc.) |
| 4 | Split into Train & Validation | Create train/ and val/ folders, split images randomly (e.g., 80:20 ratio) |
| 5 | Image Preprocessing | Resize images (e.g., 150x150), normalize (1./255), apply augmentation (optional) |
| 6 | Build CNN Model | Define a simple CNN: Conv2D → MaxPooling → Flatten → Dense → Dropout → Output |
| 7 | Compile Model | Use Adam optimizer, categorical crossentropy loss, and accuracy as metric |
| 8 | Train the Model | Train on the dataset for several epochs, monitor accuracy and loss |
| 9 | Evaluate Performance | Plot training/validation curves, generate confusion matrix and classification report |
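
Steps 3 and 4 can be sketched with the standard library alone. The helper below is a hypothetical illustration rather than the project's actual script: the split_dataset name, the fixed seed, and the assumption that each class folder directly contains its image files are mine. Because only sub-folders are visited, loose files such as the .xlsx metadata are skipped automatically.

```python
import random
import shutil
from pathlib import Path


def split_dataset(src_dir, dst_dir, val_ratio=0.2, seed=42):
    """Copy src_dir/<class>/* into dst_dir/train/<class>/ and dst_dir/val/<class>/.

    Returns {class_name: (n_train, n_val)} so the split can be checked.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    summary = {}
    # Only directories are treated as classes; top-level files are ignored.
    for class_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        images = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(images)
        n_val = int(len(images) * val_ratio)
        splits = {"val": images[:n_val], "train": images[n_val:]}
        for split_name, files in splits.items():
            out_dir = dst / split_name / class_dir.name
            out_dir.mkdir(parents=True, exist_ok=True)
            for f in files:
                shutil.copy2(f, out_dir / f.name)
        summary[class_dir.name] = (len(splits["train"]), len(splits["val"]))
    return summary
```

Splitting per class, instead of shuffling all images together, keeps the class proportions roughly the same in train/ and val/, which matters when one class is much smaller than the others.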




Diagram Flow

Tools

Conclusion

Model Performance Summary

This project demonstrates a reproducible CNN-based model for multiclass classification of chest X-ray images into three categories: COVID-19, Normal, and Viral Pneumonia. With data cleaning, augmentation, and training monitoring in place, the model generalized to the validation set without obvious overfitting.

The model achieved a peak validation accuracy of 86.94% at epoch 9. Final results include:

  • Training accuracy: 81.6%

  • Validation accuracy: 82.4%

  • Loss: consistently decreasing, indicating stable learning

These results show that even a relatively simple CNN architecture can yield strong performance when supported by good data practices.
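
Since the peak validation accuracy came earlier than the final epoch, it can be useful to identify the best epoch from the training history and keep those weights. The snippet below is a small sketch of that lookup. The best_epoch helper is hypothetical and the accuracy values are made-up placeholders, not this project's training log.

```python
def best_epoch(val_accuracy):
    """Return (1-based epoch number, value) of the highest validation accuracy."""
    best_index = max(range(len(val_accuracy)), key=lambda i: val_accuracy[i])
    return best_index + 1, val_accuracy[best_index]


# Hypothetical per-epoch validation accuracies, NOT the project's real results.
example_history = [0.62, 0.70, 0.74, 0.78, 0.81, 0.79, 0.80]

epoch, value = best_epoch(example_history)
print(f"best epoch: {epoch}, val_accuracy: {value:.2f}")
```

With Keras, the list would typically come from history.history["val_accuracy"] as returned by model.fit, and a ModelCheckpoint callback with save_best_only=True can save the weights from that epoch automatically.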

Future Work

To further improve this research, the following directions can be explored:

  • Transfer Learning: Integrate more powerful architectures like EfficientNet, ResNet, or DenseNet for better accuracy and feature extraction.

  • Model Interpretability: Use Grad-CAM or SHAP to visualize which parts of the lungs the model focuses on when making decisions.

  • Class Imbalance Handling: Apply techniques such as focal loss or class weighting to balance the learning process across underrepresented classes.

  • Deployment: Convert the model to TensorFlow Lite or ONNX for real-time inference in mobile or clinical environments.

  • Broader Dataset: Include CT scans or datasets from different sources to enhance robustness and reduce bias.
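
To make the class imbalance item above more concrete, here is a minimal pure-Python sketch of focal loss for a single sample, using the standard form -alpha * (1 - p_t)^gamma * log(p_t). The function name, the default gamma and alpha values, and the example probabilities are illustrative assumptions; nothing here has been tried on this project's model.

```python
import math


def focal_loss(probs, true_index, gamma=2.0, alpha=1.0):
    """Focal loss for one sample: -alpha * (1 - p_t)**gamma * log(p_t).

    probs is the softmax output; true_index is the position of the true class.
    """
    p_t = min(max(probs[true_index], 1e-7), 1.0)  # clamp to avoid log(0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# Illustrative softmax outputs for a sample whose true class is index 1.
confident = [0.05, 0.90, 0.05]
uncertain = [0.40, 0.30, 0.30]

print(focal_loss(confident, 1))             # easy example: loss is strongly down-weighted
print(focal_loss(uncertain, 1))             # hard example: loss stays comparatively large
print(focal_loss(uncertain, 1, gamma=0.0))  # gamma=0 reduces to plain cross-entropy
```

The (1 - p_t)^gamma factor shrinks the loss on samples the model already classifies confidently, so training focuses more on hard samples, which in an imbalanced dataset often come from the smaller classes.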

CATEGORY

COVID-19

Classification

Convolutional Neural Network (CNN)

Machine Learning

DURATION

Apr 2024
