SIIM FISABIO RSNA COVID-19 DETECTION

Vishal Sharma
9 min read · Feb 5, 2022

TABLE OF CONTENTS

  1. Introduction
  2. Business Problem
  3. DL Formulation
  4. Performance Metric
  5. Dataset Analysis
  6. Exploratory Data Analysis
  7. Data Preprocessing
  8. Modelling
  9. Test Result Comparison Of All Models
  10. Deployment of DL Model using Flask on Google Colab
  11. Future Work
  12. References

1. Introduction

This is a first-of-its-kind competition, held by SIIM and FISABIO as part of their mission to advance medical imaging informatics through education, research, and innovation.

2. Business Problem

COVID-19, a disease about five times more deadly than the flu, causes a pulmonary infection that results in inflammation and fluid in the lungs. Our job is to build a computer vision model that detects and localizes COVID-19 patterns in chest radiographs. This would help doctors reach a quick and confident diagnosis, so patients could get the right treatment before the disease becomes too severe. Here we will be working with imaging data and annotations provided by a group of radiologists.

3. DL Formulation

This case study is about localizing and classifying the abnormalities that arise due to COVID-19, based on chest radiographs.

It is both a classification and an object detection problem. By classification I mean that, given a chest radiograph, we have to assign it to one of four categories: Negative, Typical, Indeterminate, or Atypical for Pneumonia. These categories are predefined in the training dataset along with the bounding boxes. For the object detection task, we first have to classify the opacity (along with a confidence score) and then find the bounding box coordinates. If there is no object, we output "none" (along with a confidence score and a one-pixel bounding box).

4. Performance Metric

The performance metric is the mean average precision (mAP) at IoU > 0.5. Average precision is the area under the precision-recall curve, but instead of taking the raw area we first interpolate the curve, which removes the wiggles caused by the precision values and makes the curve smooth. The recall axis is then divided into 11 equally spaced points and the interpolated precision values at those points are averaged.

The interpolated precision p_interp(r) is the maximum precision over all recall levels greater than or equal to r:

p_interp(r) = max over r' >= r of p(r')

Using the 11-point interpolation method, the average precision is then:

AP = (1/11) * sum of p_interp(r) over r in {0, 0.1, ..., 1.0}

Finally, the mean average precision is calculated by averaging AP over all the classes present in the image.
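The 11-point procedure above can be sketched in a few lines of code; the precision-recall points below are illustrative toy values, not competition results:

```python
# 11-point interpolated average precision (sketch).
# `recalls` and `precisions` are assumed to be points read off a
# precision-recall curve (at IoU > 0.5), sorted by increasing recall.

def interpolated_precision(recalls, precisions, r):
    # p_interp(r) = max precision over all recall levels >= r; 0 if none exist.
    candidates = [p for rec, p in zip(recalls, precisions) if rec >= r]
    return max(candidates, default=0.0)

def average_precision_11pt(recalls, precisions):
    # Average p_interp over the 11 recall levels 0.0, 0.1, ..., 1.0.
    levels = [i / 10 for i in range(11)]
    return sum(interpolated_precision(recalls, precisions, r) for r in levels) / 11

# Illustrative toy curve:
recalls = [0.1, 0.2, 0.4, 0.6, 0.8, 1.0]
precisions = [1.0, 0.9, 0.8, 0.7, 0.5, 0.4]
ap = average_precision_11pt(recalls, precisions)
```

mAP is then simply the mean of this AP over all classes.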

5. Dataset Analysis

The training dataset contains 6,334 chest images in DICOM format. The hidden test dataset is roughly the same scale as the training dataset.

  • train_study_level.csv — the train study-level metadata, with one row for each study, including correct labels.
  • train_image_level.csv — the train image-level metadata, with one row for each image, including both correct labels and any bounding boxes in a dictionary format. Some images in both test and train have multiple bounding boxes.

train_study_level.csv

  • id - unique study identifier
  • Negative for Pneumonia - 1 if the study is negative for pneumonia, 0 otherwise
  • Typical Appearance - 1 if the study has this appearance, 0 otherwise
  • Indeterminate Appearance - 1 if the study has this appearance, 0 otherwise
  • Atypical Appearance - 1 if the study has this appearance, 0 otherwise

train_image_level.csv

  • id - unique image identifier
  • boxes - bounding boxes in easily-readable dictionary format
  • label - the correct prediction label for the provided bounding boxes

6. Exploratory Data Analysis

Let's visualize the study-level and image-level datasets first.

The study-level dataset contains 6054 unique ids, with a single label for each id.

Let's further investigate the proportion of each label in the study-level dataset.

Now let's start investigating at the image level.

Here we will first look at the count of bounding boxes per image.

Bar Plot Description

Here we can see that images with 2 bounding boxes are the most common, followed by images with no bounding boxes; images with more than two bounding boxes are the least common.
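As a sketch, such a count can be derived from the `boxes` column, which stores each image's boxes as a string holding a list of dictionaries (the rows below are toy stand-ins; in the real csv, images without boxes appear as NaN):

```python
import ast
from collections import Counter

# Toy stand-ins for the `boxes` column of train_image_level.csv:
# each entry is either None (no boxes) or a string holding a list of dicts.
boxes_column = [
    None,
    "[{'x': 587.4, 'y': 1377.0, 'width': 434.6, 'height': 332.2}]",
    "[{'x': 100.0, 'y': 200.0, 'width': 50.0, 'height': 60.0}, "
    "{'x': 300.0, 'y': 400.0, 'width': 70.0, 'height': 80.0}]",
]

def count_boxes(cell):
    # Parse the string into a Python list and count its entries.
    return 0 if cell is None else len(ast.literal_eval(cell))

# Maps "number of boxes per image" -> "number of images".
counts = Counter(count_boxes(c) for c in boxes_column)
```

Plotting `counts` as a bar chart gives the distribution described above.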

Let's visualize the bounding boxes per image.

First we will start with the images containing more than 2 bounding boxes.

Next, I will plot the images containing 2 bounding boxes.

Finally, we will look at the images containing 1 bounding box.

7. Data Preprocessing

Resizing the images was very important here, as they were provided in DICOM format and the original dataset was 128 GB. After resizing, the dataset came to less than 1 GB. Here we resized the images to 256×256.
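Since the original resizing code is not shown here, below is a minimal sketch of the step. The DICOM pixel array is simulated with random data (in practice it would come from `pydicom.dcmread(path).pixel_array`), and a simple nearest-neighbour resize stands in for whatever interpolation was actually used:

```python
import numpy as np

def normalize_to_uint8(pixels):
    # Scale raw DICOM intensities to the 0-255 range.
    pixels = pixels.astype(np.float32)
    pixels -= pixels.min()
    pixels /= max(pixels.max(), 1e-6)  # guard against constant images
    return (pixels * 255).astype(np.uint8)

def resize_nearest(img, size=256):
    # Nearest-neighbour resize via index sampling; a stand-in for
    # cv2.resize / PIL, good enough to illustrate the 256x256 target.
    h, w = img.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

# Simulated chest radiograph (real code would read a .dcm file instead).
raw = np.random.randint(0, 4096, size=(2330, 2846), dtype=np.uint16)
small = resize_nearest(normalize_to_uint8(raw))
```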

8. Modelling

First we will implement YOLO-v5, which provides the bounding boxes for the test images. Here we will be working at the image level for making predictions.

The first step is to merge the data frames at the study and image levels, so as to get a combined data frame along with the labels for the bounding boxes. At the image level there are only two classes: opacity or none.
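A sketch of this merge, assuming (as in the competition files) that the study-level id carries a `_study` suffix which must be stripped before joining on the image-level `StudyInstanceUID` column; the frames below are tiny stand-ins for the real csv files:

```python
import pandas as pd

# Toy stand-ins for the two csv files (column names follow the
# competition's data description; the real files have thousands of rows).
study_df = pd.DataFrame({
    "id": ["abc_study", "def_study"],
    "Negative for Pneumonia": [1, 0],
    "Typical Appearance": [0, 1],
    "Indeterminate Appearance": [0, 0],
    "Atypical Appearance": [0, 0],
})
image_df = pd.DataFrame({
    "id": ["img1_image", "img2_image"],
    "boxes": [None, "[{'x': 1, 'y': 2, 'width': 3, 'height': 4}]"],
    "label": ["none 1 0 0 1 1", "opacity 0.5 1 2 4 6"],
    "StudyInstanceUID": ["abc", "def"],
})

# Strip the "_study" suffix so the study id lines up with StudyInstanceUID.
study_df["StudyInstanceUID"] = study_df["id"].str.replace("_study", "", regex=False)
merged = image_df.merge(study_df.drop(columns="id"), on="StudyInstanceUID")
```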

Let's visualize our final data frame.

Using stratified k-fold, we will divide the images into train and validation sets and run the YOLO-v5 algorithm on each of the folds. In this way we end up with five different sets of weights, one per fold, which we will combine during inference.
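The fold assignment can be sketched without any library: group the images by class and deal them round-robin into five folds, which keeps the class proportions roughly equal per fold (in practice scikit-learn's `StratifiedKFold` does this job):

```python
from collections import defaultdict

def stratified_folds(labels, n_folds=5):
    # Deal the indices of each class round-robin across folds, so every
    # fold gets roughly the same class proportions.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        for i, idx in enumerate(indices):
            folds[i % n_folds].append(idx)
    return folds

# Toy labels: 10 "opacity" images and 5 "none" images.
labels = ["opacity"] * 10 + ["none"] * 5
folds = stratified_folds(labels)
# Each fold here holds exactly 2 "opacity" images and 1 "none" image.
```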

Directory Structure

  1. Main_Directory_Name → Dataset_Folds → Images → Train/Val

Steps for training YOLO-v5

1) First we have to create the directory structure mentioned above.

2) Then we clone the YOLO-v5 repository, move it into the same directory structure, and install the required dependencies from the requirements.txt file.

3) Now we have to create a YAML file under the same directory structure.
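A minimal version of that YAML file might look as follows; the paths are placeholders matching the directory structure above, and YOLO-v5 expects the train/val image paths, the number of classes, and the class names:

```python
from pathlib import Path

# Hypothetical data.yaml contents; paths mirror the directory structure
# described above and would be adjusted per fold in practice.
yaml_text = """\
train: Main_Directory_Name/Dataset_Folds/Images/Train
val: Main_Directory_Name/Dataset_Folds/Images/Val
nc: 2
names: ['none', 'opacity']
"""

Path("data.yaml").write_text(yaml_text)
```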

After completing all the setup steps for training YOLO-v5, the next step is to create the bounding-box coordinates in YOLO-v5 format from the given labels.

As we have resized the images to 256×256, we have to scale the labels accordingly. Here I have made sure that only those images for which labels are present are taken.

Here we define a function that transforms the coordinates into YOLO-v5 format. One thing to keep in mind is that the coordinates must be normalized.

The coordinates are xcenter, ycenter, width, height.

We also have to create labels for our train and validation images. Each label is a text file containing the class label followed by the bounding-box coordinates, with one line per bounding box.
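A sketch of both steps: converting corner coordinates to normalized YOLO format, then writing one label line per box (the class index 0 for opacity and the file name are assumptions for illustration):

```python
from pathlib import Path

def to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    # Convert corner coordinates to normalized (xcenter, ycenter, width, height).
    xc = (xmin + xmax) / 2 / img_w
    yc = (ymin + ymax) / 2 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return xc, yc, w, h

# One label file per image: "<class> <xc> <yc> <w> <h>", one line per box.
# Class 0 stands for "opacity" here (an assumption; YOLO classes are 0-based).
boxes = [(64, 64, 192, 192), (0, 0, 128, 64)]
lines = []
for box in boxes:
    xc, yc, w, h = to_yolo(*box, img_w=256, img_h=256)
    lines.append(f"0 {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
Path("image_id.txt").write_text("\n".join(lines))
```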

Directory Structure

Main_Directory_Name → Dataset_Folds → labels → train/val

Finally, after completing all the necessary steps, we will start training.

Final Hyperparameters

Here we have already resized the images to 256×256.

Now we will start doing classification at the study level.

We have tried four different models:

  1. VGG-16
  2. VGG-19
  3. XCEPTION NETWORK
  4. DENSENET-121 NETWORK

Creation of the train and test datasets is similar for all four models. Let's see them in detail.

We will use a transfer learning approach for multi-class classification and split the dataset into train and test sets.

For generating the train and validation datasets we will use flow_from_dataframe which, combined with the generator's augmentation settings, applies augmentations to the images on the fly.

Downloading the VGG-16 architecture with ImageNet pretrained weights. We will not train any of the layers of the VGG-16 network.

Adding layers on top of it for fine-tuning our network.
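As a sketch of this step (the exact head layers are an assumption, and `weights=None` is used here only to keep the example light; the article uses ImageNet weights):

```python
import tensorflow as tf

# VGG-16 backbone; in practice weights="imagenet" is passed, weights=None
# here only avoids downloading pretrained weights in this sketch.
base = tf.keras.applications.VGG16(
    include_top=False, weights=None, input_shape=(256, 256, 3))
base.trainable = False  # freeze all VGG-16 layers

# Hypothetical classification head for the 4 study-level classes.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(4, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```

The VGG-19, Xception, and DenseNet-121 variants below follow the same pattern with a different backbone.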

Let's start with the training process.

Downloading the VGG-19 architecture with ImageNet pretrained weights. We will not train any of the layers of the VGG-19 network.

Adding layers on top of it for fine-tuning our network.

Let's start with the training process.

Downloading the Xception architecture with ImageNet pretrained weights. We will not train any of the layers of the Xception network.

Adding layers on top of it for fine-tuning our network.

Let's start with the training process.

Downloading the DenseNet-121 architecture with ImageNet pretrained weights. We will not train any of the layers of the DenseNet-121 network.

Adding layers on top of it for fine-tuning our network.

Let's start with the training process.

9. Test Result Comparison Of All Models

Here YOLO-v5 with VGG-19 performed the best, with a private score of 0.334. For a first try we were able to achieve presentable results, and there is definitely scope for improvement.

10. Deployment of DL Model using Flask on Google Colab

As the final step, I deployed my model using Flask on Google Colab. One important thing to note when deploying through Colab is to use ngrok, which exposes a public URL, since Colab is a virtual machine without a public IP. Here I am attaching the video link of my deployed model.

11. Future Work

Here we can try resizing the images to different sizes such as 324×324, 512×512, and 684×684 with an ensemble approach using YOLO-v5. Other object detection algorithms can also be tried, such as Faster R-CNN, SSD, and other YOLO versions.

12. References

SIIM COVID-19: Convert to JPG 256px | Kaggle

SIIM-FISABIO-RSNA COVID-19 Detection | Kaggle

