Less is More - Adapting the YOLOv8 Network for Multi-Spectral Human Presence Detection

Balla, Frencis
Master thesis
View/Open
no.oslomet:inspera:232805187:126578107.pdf (26.65Mb)
URI
https://hdl.handle.net/11250/3162962
Date
2024
Collections
  • TKD - Institutt for informasjonsteknologi - Masteroppgaver [82]
Abstract
Institutions involved in Search and Rescue (SAR) operations have made their efforts more efficient by using drones equipped with thermal cameras to locate individuals in distress. One drawback of this approach is that a human operator is needed to navigate the drones and to visually inspect the imagery they provide in order to detect and pinpoint the presence of humans. Because current drones are capable of self-navigation, we believe that the efficiency of these operations would be significantly enhanced by automating the manual task of human presence detection. Several previous studies have attempted to improve object detection algorithms by using multi-spectral imagery, but their approaches have usually added significant complexity to the baseline models, making them run slower. Based on recent developments in the YOLO model family, we believe that better detection results can be reached without hindering computational efficiency. In this thesis, we have extended the YOLOv8 model to support multi-spectral imagery without introducing radical changes to its architecture. We have done so by adopting an early feature fusion strategy to obtain our multi-spectral input, changing the kernel size of the first convolution block, and upgrading the network's up-scaling blocks, which yields better small-object detection, a capability that is highly relevant for SAR operations. Additionally, we have converted three popular 4-channel datasets to a format compatible with our proposed model. Several experiments have been conducted to validate our results; the first of them showed an increase of more than 22% in the mAP50-95 metric over the baseline model. We have also compared our solution with a previous work: our model reached a value of 0.656 in the mAP50-95 metric, an improvement of more than 10% over that work. Finally, we tested the real-time capabilities of our proposed solution and found that the proposed changes had only minor effects on inference speed. To encourage future work and give back to the community, we have made the code base (https://github.com/frnc96/ms-yolov8), datasets (https://huggingface.co/datasets/Frencis), and main models (https://huggingface.co/Frencis) publicly available.
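
The early fusion idea mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis code: the function names, the tiny 2 x 2 images, the 16 output channels and the 3 x 3 kernel are all assumptions chosen for illustration. The sketch stacks an aligned thermal channel onto an RGB image to form a 4-channel input, then counts how many weights the first convolution needs when it accepts 4 channels instead of 3.

```python
# Illustrative sketch only; names and numbers are assumptions, not the thesis code.

def fuse_early(rgb, thermal):
    """Stack an RGB image (H x W x 3) and an aligned thermal image (H x W)
    into one H x W x 4 input, pixel by pixel."""
    same_shape = len(rgb) == len(thermal) and all(
        len(rgb_row) == len(thermal_row) for rgb_row, thermal_row in zip(rgb, thermal)
    )
    if not same_shape:
        raise ValueError("RGB and thermal images must have the same height and width")
    return [
        [list(pixel) + [heat] for pixel, heat in zip(rgb_row, thermal_row)]
        for rgb_row, thermal_row in zip(rgb, thermal)
    ]


def conv_weight_count(in_channels, out_channels, kernel_size):
    """Number of weights in a square 2-D convolution (bias excluded)."""
    return out_channels * in_channels * kernel_size * kernel_size


# Tiny 2 x 2 example with made-up pixel values.
rgb = [[(10, 20, 30), (40, 50, 60)],
       [(70, 80, 90), (100, 110, 120)]]
thermal = [[200, 210],
           [220, 230]]

fused = fuse_early(rgb, thermal)
print(fused[0][0])  # [10, 20, 30, 200]

# Hypothetical first layer: 16 output channels, 3 x 3 kernel.
print(conv_weight_count(3, 16, 3))  # 432 weights for a 3-channel input
print(conv_weight_count(4, 16, 3))  # 576 weights for a 4-channel input
```

Under these assumptions, fusing at the input only widens the first layer, and everything after it sees the same number of feature maps as before. This is consistent with the abstract's statement that the changes had only minor effects on inference speed, although the actual layer sizes used in the thesis may differ.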
 
 
 
Publisher
Oslo Metropolitan University
