Deep learning for crop instance segmentation
This thesis explores object detection with instance segmentation in agriculture, with the goal of identifying a detection model that could equip robotic greenhouse harvesters with improved detection accuracy. The project set out to validate an RGBD dataset of sweet pepper crops, train three instance segmentation models, and compare their performance. The RGBD dataset was found to have good-quality RGB images and annotation files, but the depth files were missing pixel values. The models Mask R-CNN, YOLACT, and QueryInst were trained both from scratch and with pretrained weights, and the learning rate was tuned to improve model performance. The models were evaluated on the mean average precision (mAP) metric. QueryInst failed to produce a mAP higher than zero. Mask R-CNN and YOLACT achieved mAP scores of 45% and 30.1% for mask predictions, and 42.4% and 33.3% for box predictions, respectively; Mask R-CNN thus had the better mAP score in both cases. Visualizing the predictions showed that Mask R-CNN made several correct detections, while YOLACT made fewer and failed to recognize smaller instances. The project also aimed to utilize the depth values of the RGBD data to produce results in a 3D representation, and this was realised using the depth information.
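The dataset-validation step mentioned above (discovering missing pixel values in the depth files) can be sketched as a simple check on each depth map. This is a hypothetical illustration, not the thesis's actual code: the zero-means-missing convention and the function name are assumptions.

```python
import numpy as np

def missing_depth_ratio(depth):
    """Fraction of pixels in a depth map with no measurement.

    Assumes the common convention that a value of 0 marks a pixel
    where the depth sensor returned no reading.
    """
    depth = np.asarray(depth)
    return float((depth == 0).sum()) / depth.size

# Example: a 4x4 depth map whose first column is missing (all zeros).
demo = np.array([
    [0, 512, 530, 541],
    [0, 515, 528, 539],
    [0, 518, 525, 537],
    [0, 520, 523, 535],
])
ratio = missing_depth_ratio(demo)  # 4 of 16 pixels are zero -> 0.25
```

Running such a check over every depth file in the dataset would reveal the kind of missing-value problem the thesis reports.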
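The mean average precision (mAP) metric used for evaluation can be illustrated in miniature: for one class at a fixed IoU threshold, predictions are sorted by confidence, greedily matched to ground-truth boxes, and the area under the resulting precision-recall curve is the average precision. This is a simplified sketch for intuition only; COCO-style mAP, as typically computed with pycocotools, additionally averages over classes and a range of IoU thresholds, and the thesis's exact evaluation pipeline is not reproduced here.

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def average_precision(preds, gts, thr=0.5):
    """AP for one class: preds is a list of (score, box), gts a list of boxes."""
    preds = sorted(preds, key=lambda p: -p[0])  # highest confidence first
    matched = set()
    tp = np.zeros(len(preds))
    for i, (_, box) in enumerate(preds):
        # Greedily match each prediction to its best unmatched ground truth.
        best, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best:
                best, best_j = overlap, j
        if best >= thr:
            tp[i] = 1
            matched.add(best_j)
    fp = 1 - tp
    tp_cum, fp_cum = np.cumsum(tp), np.cumsum(fp)
    recall = tp_cum / max(len(gts), 1)
    precision = tp_cum / (tp_cum + fp_cum)
    # Step-wise integration of the precision-recall curve.
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precision, recall):
        ap += p * (r - prev_r)
        prev_r = r
    return ap
```

For example, a single confident prediction that exactly overlaps the single ground-truth box yields an AP of 1.0, while a prediction that misses every ground truth yields 0.0.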