Citation: ZHAO Nianfu, WANG Lin, WANG Xiangjun, CHEN Wenliang. Re-detection method for long-term tracking based on improved two-stage detection networks[J]. Journal of Applied Optics, 2023, 44(4): 768-776. DOI: 10.5768/JAO202344.0402001

Re-detection method for long-term tracking based on improved two-stage detection networks

  • Abstract: To build a re-detection module suited to long-term tracking, an efficient deep network for end-to-end re-detection of a specific template target is proposed, inspired by GlobalTrack, which adapts a two-stage detection network. First, to fuse template features more efficiently on large-scale images, a cross-information enhancement module is constructed to improve the depth-wise correlation method; it encodes the search features and template features using cross channel-attention information. Second, a dynamic instance interaction module replaces the region proposal network (RPN) and region-based convolutional neural network (RCNN) structure of the traditional two-stage network, so that template information guides the classification and regression stages of the detector, yielding an end-to-end sparse re-detection structure. In comparison experiments on the LaSOT and OxUva long-term tracking datasets, the proposed method improves performance by 3% and real-time frame rate by 173% relative to the original method. The experimental results show that the improved method re-detects the template target more accurately and quickly over the whole image.
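
The abstract names depth-wise correlation as the baseline fusion step that the cross-information enhancement module improves. As a rough illustration of that baseline only (not the authors' module, whose details are not given here), the following pure-Python sketch correlates each channel of a search feature map with the matching channel of a template feature map. The function name `depthwise_xcorr` and the nested-list C x H x W layout are assumptions made for this example.

```python
def depthwise_xcorr(search, template):
    """Depth-wise (per-channel) cross-correlation in 'valid' mode.

    search:   C x H x W nested lists (search-region features)
    template: C x h x w nested lists (template features), h <= H and w <= W
    Returns a C x (H-h+1) x (W-w+1) response map; channels are never mixed.
    NOTE: hypothetical helper for illustration, not code from the paper.
    """
    if len(search) != len(template):
        raise ValueError("search and template must have the same channel count")

    response = []
    for s_ch, t_ch in zip(search, template):
        H, W = len(s_ch), len(s_ch[0])
        h, w = len(t_ch), len(t_ch[0])
        # Slide the template channel over the search channel and sum products.
        ch_out = [
            [
                sum(
                    s_ch[i + di][j + dj] * t_ch[di][dj]
                    for di in range(h)
                    for dj in range(w)
                )
                for j in range(W - w + 1)
            ]
            for i in range(H - h + 1)
        ]
        response.append(ch_out)
    return response


# One-channel toy input: a 3x3 search map and a 2x2 template.
search = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]
template = [[[1, 0], [0, 1]]]
result = depthwise_xcorr(search, template)
```

Because each channel is correlated independently, the output keeps the channel dimension; any cross-channel interaction, such as the channel attention the abstract mentions, would have to be added as a separate step.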

     
