Abstract:
To enable intelligent management of new energy vehicle teaching sites and strengthen their safety management, an improved YOLOv8 object recognition algorithm is proposed to address the occlusion of students at these sites. First, a deep over-parameterized efficient aggregation network is constructed in which a different convolution kernel is applied to each channel, strengthening feature extraction from the unoccluded parts of the target. Second, to improve the recognition of occluded targets, the SwiftFormerEncoder, a multi-layer self-attention mechanism, replaces the Bottleneck in the C2f module of YOLOv8, further enhancing the network’s feature extraction capability. Third, to speed up recognition in video, channel pruning based on the batch normalization (BN) layers is applied to the model’s convolutional layers, and the UIoU loss is used to improve the loss function, reducing the model’s memory usage and improving recognition accuracy in dense scenes. The improved algorithm was trained and validated on data from UnrealSynth (a synthetic data generator for Unreal Engine) and tested on in-house datasets collected at the practical teaching base. The test results show that the mean average precision (mAP) of the improved model reached 0.955. Tests in the video fusion recognition system indicate that the model both increases recognition speed and improves recognition accuracy, offering better overall performance than the original model.
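The abstract does not state the exact pruning criterion. As an illustration only, the sketch below assumes the common BN-based scheme in which channels are ranked by the magnitude of their BN scaling factor and the weakest fraction is removed using one global threshold. The function name `select_channels`, the layer names, and the `prune_ratio` parameter are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def select_channels(bn_gammas: Dict[str, List[float]],
                    prune_ratio: float) -> Dict[str, List[int]]:
    """Return, for each BN layer, the indices of the channels to keep.

    Assumption (not from the paper): channels are ranked by |gamma|,
    the BN scaling factor, and the lowest `prune_ratio` fraction across
    all layers is pruned with a single global threshold.
    """
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")

    # Pool |gamma| over every layer to derive one global threshold.
    magnitudes = sorted(abs(g) for gammas in bn_gammas.values() for g in gammas)
    n_prune = int(len(magnitudes) * prune_ratio)
    threshold = magnitudes[n_prune - 1] if n_prune > 0 else float("-inf")

    keep: Dict[str, List[int]] = {}
    for name, gammas in bn_gammas.items():
        kept = [i for i, g in enumerate(gammas) if abs(g) > threshold]
        if not kept:
            # Never remove an entire layer: retain its strongest channel.
            kept = [max(range(len(gammas)), key=lambda i: abs(gammas[i]))]
        keep[name] = kept
    return keep


# Hypothetical scaling factors for two BN layers.
example_gammas = {
    "bn1": [0.9, 0.01, 0.5, 0.02],
    "bn2": [0.03, 0.7, 0.04, 0.8],
}
kept_channels = select_channels(example_gammas, prune_ratio=0.5)
print(kept_channels)
```

In a real model the kept indices would then be used to slice the weights of the preceding and following convolutions, followed by fine-tuning to recover accuracy; that step is omitted here.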