Abstract:
SSD is a deep-learning-based one-stage detector built on a simple feature pyramid network (FPN) architecture. Because it lacks cross-scale information fusion, its feature extraction is limited, which leads to poor performance on small and medium objects. To alleviate this problem, this paper proposes AD-SSD (Attention & DSC Single Shot Multi-Box Detector). AD-SSD introduces fast normalized fusion into the FPN and uses a self-attention mechanism to enhance the feature information of objects at different scales. Depth-wise separable convolution (DSC) is used to reduce the number of model parameters. The method both improves the mean average precision (mAP) of SSD and increases its detection speed. Results on the PASCAL VOC07+12 dataset show that AD-SSD achieves 81.7% mAP at 55.1 FPS, while mAP for small and medium objects improves by 6.3% and 6.4%, respectively.
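
As a rough illustration of two components named above, the following minimal sketch shows one common formulation of fast normalized fusion (non-negative learnable weights divided by their sum plus a small epsilon) and a parameter count comparing a standard convolution with a depth-wise separable one. The abstract does not give AD-SSD's exact formulation, so the function names, the epsilon value, the use of flat lists in place of feature maps, and the omission of bias terms are all assumptions made for illustration.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors with normalized non-negative weights.

    Assumed form: O = sum_i (w_i / (eps + sum_j w_j)) * I_i, with w_i >= 0.
    Flat lists stand in for same-shaped feature maps (illustrative only).
    """
    # Clamp each weight at zero (ReLU) so the normalization stays stable.
    w = [max(0.0, x) for x in weights]
    total = sum(w) + eps
    length = len(features[0])
    return [
        sum(wi * f[k] for wi, f in zip(w, features)) / total
        for k in range(length)
    ]


def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def dsc_params(k, c_in, c_out):
    """Weights in a depth-wise k x k conv followed by a 1 x 1 point-wise conv."""
    return k * k * c_in + c_in * c_out


if __name__ == "__main__":
    fused = fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0])
    print(fused)
    print(conv_params(3, 256, 256), dsc_params(3, 256, 256))
```

For a 3 x 3 layer with 256 input and 256 output channels, the counts under these assumptions are 589,824 weights for the standard convolution and 67,840 for the depth-wise separable one, which is the kind of reduction the abstract refers to.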