Citation: Wu Xueping, Sun Shaoyuan, Li Jiahao, Li Dawei. Infrared behavior recognition based on spatio-temporal two-stream convolutional neural networks[J]. Journal of Applied Optics, 2018, 39(5): 743-750. DOI: 10.5768/JAO201839.0506002

Infrared behavior recognition based on spatio-temporal two-stream convolutional neural networks

    Abstract: To address human behavior recognition in infrared video, an infrared human behavior recognition method based on a spatio-temporal two-stream convolutional neural network is proposed. The entire infrared video is first divided into equal segments. From each segment, a randomly sampled infrared image and its corresponding optical flow image are input into the spatial convolutional neural network; by fusing the optical flow information, the spatial network can effectively learn the spatial information of the regions in the infrared image where motion actually occurs. The recognition results of the individual segments are then fused to obtain the spatial network result. In parallel, an optical flow image sequence randomly sampled from each segment is input into the temporal convolutional neural network, and the per-segment results are fused to obtain the temporal network result. Finally, the spatial and temporal network results are combined by weighted summation to obtain the final video classification result. In the experiment, the method was applied to an infrared video dataset containing 23 action categories and achieved a recognition accuracy of 92.0%. The results show that the algorithm can recognize behaviors in infrared video effectively and accurately.
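The sampling and fusion steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of averaging to fuse per-segment scores, and the stream weights `w_spatial` and `w_temporal` are all assumptions made for illustration, since the abstract does not specify them. The per-segment class scores are assumed to come from already-trained spatial and temporal networks.

```python
import random


def sample_segment_indices(num_frames, num_segments, rng=None):
    """Split the frame range into equal segments and draw one random index per segment."""
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    rng = rng or random.Random()
    indices = []
    for k in range(num_segments):
        # Integer arithmetic keeps segment boundaries exact.
        start = k * num_frames // num_segments
        end = (k + 1) * num_frames // num_segments  # exclusive
        indices.append(rng.randrange(start, end))
    return indices


def fuse_segments(segment_scores):
    """Fuse per-segment class scores by averaging (averaging is an assumption)."""
    n = len(segment_scores)
    return [sum(column) / n for column in zip(*segment_scores)]


def fuse_streams(spatial, temporal, w_spatial=1.0, w_temporal=1.5):
    """Weighted sum of the two stream scores; the weights are illustrative only."""
    return [w_spatial * s + w_temporal * t for s, t in zip(spatial, temporal)]


def predict(spatial_segment_scores, temporal_segment_scores, **weights):
    """Return the index of the class with the highest fused score."""
    fused = fuse_streams(
        fuse_segments(spatial_segment_scores),
        fuse_segments(temporal_segment_scores),
        **weights,
    )
    return max(range(len(fused)), key=fused.__getitem__)
```

For example, `sample_segment_indices(300, 3)` yields one frame index from each third of a 300-frame video, and `predict` takes one list of class scores per segment for each stream.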

     
