Summing up my GSoC experience

 Introduction

Once again, I am Daniil Orel from Kazakhstan (Nur-Sultan / Astana city). This summer I was working with LibreHealth organisation for an amazing GSoC project.

Few words about my host organisation

LibreHealth is a collaborative community, which supports free open source software projects in HealthTech. This organisation has multiple projects, but I was working towards the radiology project.

My project

Several medical procedures in surgery or interventional radiology are recorded as videos used for review, training, and quality monitoring. These videos have at least three interesting artifacts - anatomical structures such as organs, tumors, tissues, etc.; medical equipment, and medical information overlaid that describes the patient. It will be immensely helpful for review and search purposes if these can be identified and automatically labeled in the videos.

My task was to develop ML models which can process videos and that are suitable for VR environment. As a result, I have developed ML models for classification, segmentation and detection of surgical tools in VR. Additionally I have developed an ML model for polyps and organs segmentation.

Work done

There are 4 major paths of my work.

Classification

Classification task was simple: given a crop with a surgical tool, I had to classify it.

Example of crops
To solve this problem I have simply trained an inception-V3 model for classification. It was trained on images that simulated VR environment (were curved) and the resulting performance of the model looks great:
Model AUC
Additionally I have built a pipeline for running this model in different circumstances:
  • Running classification model as a script from file
  • Running classification model from other packages
The first option can be used if there are is a folder with surgical tools crops and it is needed to classify them. It can be run using the following command from the classification folder:

python inference_scripy.py --img-dir="DIRECTORY WITH IMAGES" --save-to="OUTPUT DIRECTORY" --onnx-model-path="ONNX FILE" --seg-type="SEGMENTATION TYPE"


The second option is suitable for cases when a model should be called from another functions and/or classes. Such implementation makes the system more flexible.

More details are in this commit.

Segmentation

Segmentation task was split on two parts:
  1. Binary polyp segmentation
  2. Multi class segmentation (organs and surgical tools)
For binary segmentation the model was able to segment polyps with high accuracy:

Polyp segmentation examples

The quantitative result is IoU score of 0.93, which is perfect.

In case of multi class segmentation, results are great as well:
Multi class segmentation examples

For these models there are three use cases. First two are the same as for classification model. It can be found in this commit. The third use case is video segmentation. I have developed two classes:
  • One for processing video (applying segmentation to frames) in an online mode
  • Rewriting a video, after processing it (applying segmentation to all frames and writing them as another video)
The video processor class is defined here. Examples of how to use it for segmentation can be found in "examples" folder. Example from here generates the frames like the one below:
Screenshot 2022-09-05 at 10.43.06
Example of video segmentation

Detection

Detection task was performed for surgical tools. The model was trained using roboflow framework. Results of training are here:
Detection model training


Inference for this model is similar to the one for segmentation. The only difference is that this model requires Internet-connection, since my model is hosted on roboflow service. 

System as whole

A small system diagram

To explain how the whole system works, I can use a simple diagram. The figure above shows that there are 4 possible ML models. Each of them can be used as a backbone in the annotator pipeline. 

It means that any model can be used as a processing tool within the VideoProcessor class, while this class can be used inside Video Writer, to write the processed videos.


Examples of how to use it can be found here. More precisely, processing pipeline for segmentation is here , while the pipeline for detection is here. Examples for writing video files using detection and segmentation models are also available.

Merged pull requests and commits

My work and merged pull requests is here:

Conclusion

To sum up, I have successfully finished the GSoC. All required ML models were developed. Additionally I have developed the inference pipelines for 3 different use cases: use as a script, use as a function and video processing.

Comments

Popular posts from this blog

How can we "simulate" VR?

GSoC week 2: developing a model for semantic segmentation