Abstract:
Convolution layers with a fixed receptive field are tuned to objects of conventional size and tend to miss the features of small targets, which lowers detection accuracy in object detection. To address this problem, an adaptive spatial attention mechanism is proposed. The method adds parallel convolution kernels of different sizes and is embedded in the 3×3 convolution layer of the Scaled-YOLOv4 residual structure, so that the network can adjust its receptive field size to targets of different sizes and strengthen feature extraction for small targets. Experimental results show that the new network model effectively improves detection accuracy for small targets and reduces the false detections and missed detections of the original model. Detection accuracy on datasets such as MSCOCO and PASCAL VOC is greatly improved.
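
The idea of parallel kernels with an adaptive receptive field can be illustrated with a minimal, dependency-free sketch. It is not the paper's implementation: the abstract does not specify the gating design, so the per-location softmax over branch responses, the box (averaging) kernels, and the names `conv2d_same`, `box_kernel`, `adaptive_fusion`, and `gates` are all assumptions made for illustration. In a real network the kernels and gate parameters would be learned.

```python
import math


def box_kernel(size):
    """Averaging kernel of the given odd size (stand-in for a learned kernel)."""
    value = 1.0 / (size * size)
    return [[value] * size for _ in range(size)]


def conv2d_same(image, kernel):
    """Zero-padded 'same' 2D cross-correlation on a list-of-lists image."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fusion(image, kernels, gates):
    """Fuse parallel convolution branches with per-location attention weights.

    Assumption: `gates` holds one hypothetical scalar per branch, standing in
    for whatever learned gating layer the actual model uses.
    Returns the fused map and the per-location branch weights.
    """
    branches = [conv2d_same(image, k) for k in kernels]
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    attn = [[None] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            responses = [b[i][j] for b in branches]
            # Branch weights depend on the input at this location, so the
            # effective receptive field varies across the feature map.
            weights = softmax([g * r for g, r in zip(gates, responses)])
            attn[i][j] = weights
            out[i][j] = sum(wt * r for wt, r in zip(weights, responses))
    return out, attn
```

Calling `adaptive_fusion(image, [box_kernel(3), box_kernel(5)], gates)` on a small 2D list returns a fused map of the same size together with the branch weights at each location. With all gates set to zero the weights are uniform and the result reduces to a plain average of the branches; non-zero gates let locations favor the smaller or larger kernel depending on the local response.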