I. Introduction
Unmanned aerial vehicle (UAV) is a well-known and convenient system for collecting pictures from an incredibly low elevation. In comparison to their smaller sizes, UAVs have a high level of flexibility and thus, provide aerial applications in various domains, such as fire detection, traffic congestion or accidents, search, and rescue operations according to user requirements. Owing to recent innovations in deep neural networks (DNNs), particularly, convolutional neural networks (CNNs), the performance of CNN-based detectors has increased dramatically [1], [2]. Most deep learning (DL) models not only defeat human-centered visual systems in occluded environments and accurate object detection but also surpass conventional algorithms in detection accuracy.