Citation: YANG Yanchen, YUN Lijun, MEI Jianhua, LU Lin. Gait recognition method of infrared human body images based on improved ViT[J]. Journal of Applied Optics, 2023, 44(1): 71-78. DOI: 10.5768/JAO202344.0102002

Gait recognition method of infrared human body images based on improved ViT

Abstract: Convolutional neural networks (CNNs) tend to saturate in accuracy on gait recognition, and the Vision Transformer (ViT) fits gait datasets inefficiently. To address both problems, a symmetric dual-attention model is proposed. The model preserves the temporal order of walking postures and fits gait image patches with several independent feature subspaces. Its symmetric architecture strengthens the role of the attention modules in fitting gait features, and heterogeneous transfer learning further improves the efficiency of feature fitting. In repeated simulation experiments on the CASIA C infrared human gait database of the Chinese Academy of Sciences, the model reached an average recognition accuracy of 96.8%. The results show that the proposed model outperforms both the conventional ViT model and the CNN baseline models in stability, data fitting speed, and recognition accuracy.
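
The abstract gives no implementation details, so the following is only an illustrative sketch and not the authors' symmetric dual-attention model. It assumes that "independent feature subspaces" can be read as the heads of standard multi-head self-attention: each patch embedding is sliced into equal sub-vectors, and scaled dot-product attention runs separately in each slice. The learned query, key, value, and output projections of a real ViT are omitted for brevity, and all function names are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention for one head.

    q, k, v are lists of token vectors (lists of floats).
    """
    scale = math.sqrt(len(q[0]))
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / scale for kj in k]
        weights = softmax(scores)
        out.append([
            sum(w * vj[d] for w, vj in zip(weights, v))
            for d in range(len(v[0]))
        ])
    return out


def split_heads(tokens, num_heads):
    """Slice every token vector into num_heads equal sub-vectors."""
    dim = len(tokens[0])
    if dim % num_heads != 0:
        raise ValueError("embedding size must be divisible by num_heads")
    step = dim // num_heads
    return [[t[h * step:(h + 1) * step] for t in tokens]
            for h in range(num_heads)]


def multi_head_self_attention(tokens, num_heads):
    """Self-attention per subspace, then concatenation per token.

    Assumption: identity projections (Q = K = V = the sliced input).
    Token order in the input list is preserved in the output list.
    """
    heads = split_heads(tokens, num_heads)
    per_head = [attention(h, h, h) for h in heads]
    return [sum((per_head[h][i] for h in range(num_heads)), [])
            for i in range(len(tokens))]


if __name__ == "__main__":
    # Three toy "patch embeddings" of size 4, processed with 2 heads.
    patches = [[1.0, 0.0, 0.0, 1.0],
               [0.0, 1.0, 1.0, 0.0],
               [1.0, 1.0, 0.0, 0.0]]
    for row in multi_head_self_attention(patches, num_heads=2):
        print([round(x, 3) for x in row])
```

Because each head sees only its own slice of the embedding, the heads can attend to different patch relationships independently, which is the property the sketch is meant to illustrate.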

     
