Action Recognition Based on an scSE Non-Local Two-Stream ResNet Network

李占利; 王佳莹; 靳红梅; 李洪安

doi:10.3969/j.issn.1000-386x.2024.08.046

ACTION RECOGNITION ALGORITHM FOR NON-LOCAL TWO-STREAM RESNET NETWORK BASED ON SCSE FUSION

Abstract

To address the low recognition rate of two-stream networks on videos whose frames contain redundant information, this paper introduces scSE (Spatial and Channel Squeeze & Excitation Block) and non-local operations into the two-stream network to build the SC_NLResNet action recognition framework. The framework divides a video into equal-length, non-overlapping temporal segments and sparsely samples each segment, extracting RGB frames and optical flow images as the input of the scSE module. The features processed by scSE are then fed into the non-local two-stream ResNet network, and the per-segment results are fused to obtain the final prediction. On the UCF101 and HMDB51 datasets, the accuracy reaches 96.9% and 76.2%, respectively. The results show that combining non-local operations with the scSE module strengthens the spatio-temporal and inter-channel information in the features and improves accuracy, which verifies the effectiveness of the SC_NLResNet network.
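The segment-wise sparse sampling and the fusion of per-segment predictions described in the abstract can be illustrated with a minimal sketch. The abstract does not state how a frame is chosen inside a segment or how segment results are fused, so the choices below are assumptions: the middle frame is used by default (or a random frame when a generator is passed), and fusion is a plain average of class scores. The function names `sparse_sample_indices` and `fuse_segment_scores` are hypothetical and do not come from the paper.

```python
import random
from typing import List, Optional, Sequence


def sparse_sample_indices(num_frames: int, num_segments: int,
                          rng: Optional[random.Random] = None) -> List[int]:
    """Pick one frame index from each of `num_segments` equal, non-overlapping segments.

    Assumption: with `rng` given, the index is drawn uniformly inside each
    segment; without it, the middle frame of each segment is used.
    """
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    seg_len = num_frames // num_segments  # trailing leftover frames are ignored
    indices = []
    for k in range(num_segments):
        start = k * seg_len  # first frame of segment k
        offset = rng.randrange(seg_len) if rng is not None else seg_len // 2
        indices.append(start + offset)
    return indices


def fuse_segment_scores(segment_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-segment class scores into one video-level score vector.

    Assumption: averaging stands in for the fusion step, which the
    abstract does not specify.
    """
    if not segment_scores:
        raise ValueError("need at least one segment")
    num_segments = len(segment_scores)
    num_classes = len(segment_scores[0])
    return [sum(s[c] for s in segment_scores) / num_segments
            for c in range(num_classes)]
```

For a 90-frame video split into 3 segments, `sparse_sample_indices(90, 3)` returns `[15, 45, 75]`, one frame per 30-frame segment. In a two-stream setting, the same indices would select both the RGB frames and the corresponding optical flow images, and the class scores that the network produces for each segment would be passed to `fuse_segment_scores`.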
