login
Home / Papers / Efficient Small Object Detection You Only Look Once: A Small...

Efficient Small Object Detection You Only Look Once: A Small Object Detection Algorithm for Aerial Images

88 Citations•2024•
Jie Luo, Zhicheng Liu, Yibo Wang
Sensors (Basel, Switzerland)

An efficient small-object YOLO detection model (ESOD-YOLO) based on YOLOv8n for Unmanned Aerial Vehicle (UAV) object detection is proposed and achieves higher detection accuracy with lower parameters.

Abstract

Aerial images have distinct characteristics, such as varying target scales, complex backgrounds, severe occlusion, small targets, and dense distribution. As a result, object detection in aerial images faces challenges like difficulty in extracting small target information and poor integration of spatial and semantic data. Moreover, existing object detection algorithms have a large number of parameters, posing a challenge for deployment on drones with limited hardware resources. We propose an efficient small-object YOLO detection model (ESOD-YOLO) based on YOLOv8n for Unmanned Aerial Vehicle (UAV) object detection. Firstly, we propose that the Reparameterized Multi-scale Inverted Blocks (RepNIBMS) module is implemented to replace the C2f module of the Yolov8n backbone extraction network to enhance the information extraction capability of small objects. Secondly, a cross-level multi-scale feature fusion structure, wave feature pyramid network (WFPN), is designed to enhance the model’s capacity to integrate spatial and semantic information. Meanwhile, a small-object detection head is incorporated to augment the model’s ability to identify small objects. Finally, a tri-focal loss function is proposed to address the issue of imbalanced samples in aerial images in a straightforward and effective manner. In the VisDrone2019 test set, when the input size is uniformly 640 × 640 pixels, the parameters of ESOD-YOLO are 4.46 M, and the average mean accuracy of detection reaches 29.3%, which is 3.6% higher than the baseline method YOLOv8n. Compared with other detection methods, it also achieves higher detection accuracy with lower parameters.