Posts

Summing up my GSoC experience

Introduction. Once again, I am Daniil Orel from Kazakhstan (Nur-Sultan / Astana city). This summer I worked with the LibreHealth organisation on an amazing GSoC project. A few words about my host organisation. LibreHealth is a collaborative community that supports free, open-source software projects in HealthTech. The organisation has multiple projects, and I worked on the radiology project. My project. Several medical procedures in surgery or interventional radiology are recorded as videos, which are used for review, training, and quality monitoring. These videos contain at least three kinds of interesting artifacts: anatomical structures such as organs, tumors, and tissues; medical equipment; and overlaid medical information that describes the patient. It would be immensely helpful for review and search purposes if these could be identified and automatically labeled in the videos. My task was to develop ML models that can process videos and are suitable for a VR environment. As a result, I h

Video processing (Detection)

The last part of my work to show is a video-processing tool that uses a detection model. The detection process. Since I am using the roboflow library for predictions, my model is available by URL. The issue is that an image has to be sent as a file (roboflow processes it as a byte stream), which is why I have to save the image as "tmp.jpg" every time the predict function is called. Fortunately, the quality of the model is high. Here are some examples of images unseen by the model. In the images above, the bounding boxes are drawn correctly, and the classes are indicated correctly as well. The high quality of the model suggests that it could be used for monitoring surgical tools during training sessions.
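The temporary-file step described above can be sketched roughly as follows. This is a minimal sketch, not the project's actual code: `predict_from_bytes` is a hypothetical helper, `predict_fn` is a hypothetical stand-in for whatever call sends a file to the hosted model, and the bytes are assumed to be an already JPEG-encoded frame.

```python
import os
import tempfile
from typing import Any, Callable


def predict_from_bytes(jpeg_bytes: bytes, predict_fn: Callable[[str], Any]) -> Any:
    """Write an encoded frame to a temporary .jpg file and run predict_fn on its path."""
    # delete=False so the file can be reopened by name on every platform
    tmp = tempfile.NamedTemporaryFile(suffix=".jpg", delete=False)
    try:
        tmp.write(jpeg_bytes)
        tmp.close()  # flush to disk before the predictor opens the file
        return predict_fn(tmp.name)
    finally:
        tmp.close()  # harmless if already closed
        os.remove(tmp.name)  # clean up so files do not pile up per frame
```

Compared with a fixed "tmp.jpg", a uniquely named temporary file avoids clashes if two predictions ever run at the same time, at the cost of slightly more code.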

Video processing improved

If you are reading this post, you have most probably seen my last post about the video processing pipeline. That approach has a few issues: segmentation classes are hard to distinguish, and it is slow. For the last two weeks I have been working on fixing these issues. Improving the processing speed. To fix the processing speed, I only had to adjust a few OpenCV parameters. Initially, the FPS of the resulting video did not match the original (it was lower), but after I fixed this, all processing became faster. Making segmentation easier to read. I have adjusted the project so that two modes are now available: coloured segmentation and segmentation of a single class. Coloured segmentation. Example of coloured segmentation. In the image above it is easy to differentiate the segmentation classes, but there is one issue: you cannot identify the classes if their indexes are not known (so you have to read the documentation :D). Single class segmentation. Th

Video Processing Pipeline

This week I was developing a pipeline to apply the pretrained models to a video. As a reference point, I used a library written by my ex-colleague. The pipeline consists of two classes: VideoProcessor, which inherits from the OpenCV VideoCapture and allows applying several transformation functions to the video frame by frame, and ProcessedWriter, which uses the VideoProcessor and combines it with the OpenCV VideoWriter. The results of the pipeline are surprisingly good. For example, in the screenshots below you can see some frames from the processed video.
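The two-class design described above could look roughly like the sketch below. It is an illustration under assumptions, not the project's code: to keep it runnable without OpenCV installed, it wraps a capture object by composition instead of inheriting from VideoCapture, and it only assumes that the capture has a `read()` method returning `(ok, frame)` and that the writer has a `write(frame)` method, as `cv2.VideoCapture` and `cv2.VideoWriter` do. The method names `run` and the constructor arguments are hypothetical.

```python
from typing import Any, Callable, Iterator, Sequence

Frame = Any  # with OpenCV this would be a NumPy image array


class VideoProcessor:
    """Reads frames from a capture-like object and applies transforms in order."""

    def __init__(self, capture: Any, transforms: Sequence[Callable[[Frame], Frame]]):
        self.capture = capture  # needs read() -> (ok, frame)
        self.transforms = list(transforms)

    def __iter__(self) -> Iterator[Frame]:
        while True:
            ok, frame = self.capture.read()
            if not ok:  # end of stream or read error
                break
            for transform in self.transforms:
                frame = transform(frame)
            yield frame


class ProcessedWriter:
    """Pulls processed frames from a VideoProcessor and writes them out."""

    def __init__(self, processor: VideoProcessor, writer: Any):
        self.processor = processor
        self.writer = writer  # needs write(frame)

    def run(self) -> int:
        """Process the whole stream and return the number of frames written."""
        count = 0
        for frame in self.processor:
            self.writer.write(frame)
            count += 1
        return count
```

Keeping the transformations as a plain list of functions means a segmentation overlay or a detection overlay can be plugged in without touching the reading and writing logic.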

Inference for surgical tools detection

Detection. This week I was writing the inference script for the surgical tools detection model and creating a classification model (similar to the one used to solve the entrance exercise). Inference for Roboflow models is very simple, but I had to read the docs carefully in order not to reinvent the wheel. One issue with Roboflow inference is that it does not work with pathlib Path objects. This bug is not obvious, so I had to spend quite a lot of time finding it. It also turns out that the model is well optimised for runtime, so no additional conversion (to ONNX or TensorRT) was needed. Classification. For classification I used the same approach as for the entrance test. A notebook with the experiments and the inference will be published soon.
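One way to guard against the Path problem is to convert the path to a plain string at a single entry point. The sketch below assumes only that the model object exposes a `predict(path, ...)` method; `predict_image` is a hypothetical wrapper, not part of any library.

```python
import os
from pathlib import Path
from typing import Any, Union


def predict_image(model: Any, image_path: Union[str, Path], **kwargs: Any) -> Any:
    """Call model.predict with a plain string path, whatever type was passed in."""
    # os.fspath turns a Path into str and leaves a str unchanged
    return model.predict(os.fspath(image_path), **kwargs)
```

With this wrapper the rest of the code can keep using pathlib, and the conversion happens in exactly one place.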

Surgical tools detection

This week I was training the Surgical Tools Detection model. The dataset was taken from here. To train the model and prepare it for inference I used Roboflow. Roboflow is a robust web platform for collecting datasets, training models and running them in production. Training. I split the data 69/21/10 into train, val and test sets. Graphs of the training metrics are shown below. Train / val metrics. To be more precise, on the test set the model has 93.8% precision with 90% recall and 96.4% mAP. This is considered high performance and means that the model can be used in production. Inference. Roboflow itself provides multiple ways to run inference. I could use the web API method (requests are sent to the server, which sends back the predictions), but to make it more flexible I prefer the Python package method. It can easily be imported into the main LibreHealth project and run.
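For readers who want to reproduce a 69/21/10 split outside the web platform, here is a minimal sketch. The post does not say how the split was actually made, so the shuffling, the seed, and the helper name `split_dataset` are all assumptions for illustration.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(items: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle items reproducibly and split them 69/21/10 into train, val, test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding surprises
    n_train = len(shuffled) * 69 // 100
    n_val = len(shuffled) * 21 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder, roughly 10%
    return train, val, test
```

Fixing the seed keeps the three sets identical between runs, so the test metrics stay comparable across experiments.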

More experiments with bounding boxes and refactoring

This week I was mostly refactoring my code and running experiments. Refactoring. First of all, I made the segmentation inference pipeline more flexible by using inheritance for the dataset classes and making sure that my ONNX runtime works identically for both binary and multi-class segmentation models. Then I committed these changes. Experiments. Secondly, this week I was trying to redistort bboxes using a method from the OpenCV forum. The idea was simple: I applied the distortion to a chessboard image, calculated the camera matrix and distortion matrix, and then put them into the formula. Chessboard before distortion. Chessboard after distortion. Then I applied a technique for camera calibration and got the following representation: Distorted chessboard with corners. After that I had the distortion and camera matrices for my images. I simply hardcoded them and applied them to the surgical tools dataset. The resulting distortion was not that strong, but coordinates of bounding boxes have cha