
Showing posts from June, 2022

GSoC week 2: developing a model for semantic segmentation

This week I was developing models for medical image segmentation in VR.

Data

I worked with two datasets: a binary semantic segmentation dataset and a multi-class semantic segmentation dataset. Binary segmentation was done on the Polyp Segmentation in Colonoscopy data, which contains 300 images with binary masks. Multi-class segmentation was done on the Cholec8k dataset, which contains 8,080 laparoscopic cholecystectomy image frames taken from 17 videos of the Cholec80 dataset.

Model and training

For both segmentation tasks I used the DeepLabV3Plus model with a ResNet50 encoder pretrained on the ImageNet dataset. For polyp segmentation I used 80% of the data for training and 20% for validation/testing. For Cholec8k I decided to make the train-test split based on videos: 12 videos were taken for training and 5 for testing. For training I used the open-source Catalyst framework and SMP, which reduced the time required for building the training pipeline.
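The video-based split mentioned above can be sketched as follows. This is a toy illustration with made-up frame records, not the real Cholec8k loader: splitting by video id guarantees that no video contributes frames to both sets, which matters because consecutive frames from one surgery are nearly identical and would leak between train and test.

```python
import random

def split_by_video(frames, n_test_videos=5, seed=0):
    """Split frame records into train/test by video id, so no video
    appears in both sets (avoids leakage between similar frames)."""
    videos = sorted({f["video"] for f in frames})
    rng = random.Random(seed)
    test_videos = set(rng.sample(videos, n_test_videos))
    train = [f for f in frames if f["video"] not in test_videos]
    test = [f for f in frames if f["video"] in test_videos]
    return train, test

# Toy data: 17 source videos (as in Cholec8k), 10 frames each.
frames = [{"video": v, "frame": i} for v in range(17) for i in range(10)]
train, test = split_by_video(frames, n_test_videos=5)
print(len(train), len(test))  # 120 50
```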

How can we "simulate" VR?

Having the proper data is important when it comes to developing an ML/DL application. In some cases the amount of data is limited. For example, if we are trying to detect a rare disease, collecting a million samples could simply be impossible. When collecting a proper dataset is too expensive, we cannot afford it either. What shall we do in such a case?

The answer is simple: we have to generate data. Generative Adversarial Networks are said to be one of the ways to synthetically generate data. However, they require at least some data from the domain of interest to generate new data. In this case augmentations can help. Data augmentation is the artificial creation of training data for machine learning by transformations. It means that we can change the original data to increase the training sample and/or apply some transformations to another dataset to get something in the domain of our interest. The second technique can be used, for example, to add…
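A minimal sketch of the augmentation idea for segmentation: the key point is that any spatial transform must be applied identically to the image and its mask, or the labels drift away from the pixels. This toy example uses plain nested lists and a horizontal flip; real pipelines would use a library such as albumentations or torchvision transforms.

```python
def hflip(grid):
    """Flip a 2-D grid (image or mask) left-to-right."""
    return [row[::-1] for row in grid]

def augment_pair(image, mask):
    """Apply the same spatial transform to both image and mask,
    keeping segmentation labels aligned with the pixels."""
    return hflip(image), hflip(mask)

image = [[1, 2], [3, 4]]
mask = [[0, 1], [0, 0]]
aug_img, aug_mask = augment_pair(image, mask)
# aug_img  == [[2, 1], [4, 3]]
# aug_mask == [[1, 0], [0, 0]]
```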

Some info about the data I'm going to work with

Data is the main part of any DS/ML project. In this post I will discuss the data source for my project.

Cholec80

This dataset provides a large (80 GB) collection of laparoscopic videos with phase annotations. In addition to phases, the dataset provides annotations of the surgical tools used. In the rest of this paragraph I will be referring to the paper related to the dataset. More about the videos (Twinanda et al., 2016):

- There are 80 videos (one surgery per video).
- There are 13 surgeons who carry out the surgeries (this could affect the data domain).
- There are 7 phases, namely: Preparation, Calot triangle dissection, Clipping and cutting, Gallbladder dissection, Gallbladder packaging, Gallbladder retraction, Cleaning and coagulation.

Figure 1

Initially the description sounds great, but.. if we read the paper (or check Figure 1), it is clear that there is a significant cla…
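One quick way to inspect a phase-annotated dataset like the one described above is to count how many frames fall into each phase. This is a hypothetical sketch with made-up labels, not the actual Cholec80 annotation format; it only shows the bookkeeping.

```python
from collections import Counter

def phase_frequencies(labels):
    """Return per-phase (count, fraction) pairs, useful for spotting
    imbalance between short and long surgical phases."""
    counts = Counter(labels)
    total = len(labels)
    return {phase: (n, n / total) for phase, n in counts.items()}

# Hypothetical per-frame phase labels for one short clip.
labels = (["Preparation"] * 2
          + ["Calot triangle dissection"] * 6
          + ["Clipping and cutting"] * 2)
freqs = phase_frequencies(labels)
# freqs["Calot triangle dissection"] == (6, 0.6)
```

Long phases such as the dissection steps dominate the frame count, which is exactly the kind of imbalance such a tally makes visible.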