Low-light pedestrian detection and tracking algorithm based on autoencoder structure and improved Bytetrack

任泽林; 庞澜; 王超; 李嘉恒; 周方琰

doi:10.5768/JAO202445.0302001

Abstract

To address the difficulty of extracting target features and the instability of tracking in low-light night scenes, a multi-target pedestrian detection and tracking algorithm based on an autoencoder structure and an improved Bytetrack is proposed. In the detection stage, a multi-task auto-encoding transformation model framework is built on you only look once X (YOLOX). The framework accounts for the physical noise model and the image signal processing (ISP) pipeline in a self-supervised manner: it learns the intrinsic visual structure by encoding and decoding the real illumination degradation transformation and, based on this representation, performs object detection by decoding bounding box coordinates and classes. To suppress interference from background noise, the adaptive feature fusion module ASFF is introduced into the neck network of the object decoder. In the tracking stage, the Bytetrack algorithm is improved: the appearance embeddings extracted by a Transformer-based re-identification network and the motion information obtained from the NSA Kalman filter are combined by adaptive weighting to perform data association, and the two-stage Byte matching procedure then completes the tracking of pedestrians at night. The generalization ability of the detection model is tested on a self-built night pedestrian detection dataset, where mAP@0.5 reaches 94.9%, indicating that the proposed degradation transformation is consistent with real conditions and generalizes well. Finally, multi-target tracking performance is verified on a self-built night pedestrian tracking dataset. The proposed algorithm achieves a multiple object tracking accuracy (MOTA) of 89.55% and an identity F1 score (IDF1) of 88.34%, with 15 ID switches (IDs). Compared with the baseline Bytetrack, MOTA is improved by 10.72%, IDF1 is improved by 6.19%, and IDs are reduced by 50%. These results show that the proposed multi-target tracking algorithm based on the autoencoder structure and improved Bytetrack can effectively address the difficulty of pedestrian tracking in low-light night scenes.
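The data association step in the tracking stage can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: the abstract does not say how the adaptive weight is computed, so the hypothetical `fused_cost` simply uses the detector confidence as the appearance weight; the score and cost thresholds are placeholder values; and a greedy lowest-cost-first matcher stands in for the optimal assignment solver that trackers of this kind typically use. The class and function names (`Detection`, `Track`, `byte_associate`) are likewise hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    box: tuple        # (x1, y1, x2, y2)
    score: float      # detector confidence in [0, 1]
    embedding: tuple  # appearance vector, e.g. from a re-ID network


@dataclass
class Track:
    track_id: int
    box: tuple        # box predicted by the motion model for this frame
    embedding: tuple  # appearance vector stored for the track


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    if nu == 0 or nv == 0:
        return 1.0
    return 1.0 - dot / (nu * nv)


def fused_cost(track, det, use_appearance):
    """Weighted sum of appearance and motion (IoU) distances."""
    motion = 1.0 - iou(track.box, det.box)
    if not use_appearance:
        return motion
    # ASSUMPTION: detector confidence serves as the adaptive weight.
    w = det.score
    appearance = cosine_distance(track.embedding, det.embedding)
    return w * appearance + (1.0 - w) * motion


def greedy_match(tracks, dets, max_cost, use_appearance):
    """Greedy stand-in for an optimal assignment: cheapest pairs first."""
    pairs = sorted(
        (fused_cost(t, d, use_appearance), ti, di)
        for ti, t in enumerate(tracks)
        for di, d in enumerate(dets)
    )
    used_t, used_d, matches = set(), set(), []
    for cost, ti, di in pairs:
        if cost > max_cost:
            break  # every remaining pair is at least this costly
        if ti in used_t or di in used_d:
            continue
        used_t.add(ti)
        used_d.add(di)
        matches.append((tracks[ti].track_id, dets[di]))
    unmatched = [t for i, t in enumerate(tracks) if i not in used_t]
    return matches, unmatched


def byte_associate(tracks, dets, high=0.6, low=0.1):
    """Two-stage matching: high-score detections first, then low-score ones."""
    high_dets = [d for d in dets if d.score >= high]
    low_dets = [d for d in dets if low <= d.score < high]
    # Stage 1: fused appearance + motion cost on confident detections.
    first, remaining = greedy_match(tracks, high_dets, 0.7, True)
    # Stage 2: motion (IoU) cost only on the low-score detections.
    second, lost = greedy_match(remaining, low_dets, 0.5, False)
    return first + second, lost
```

In this sketch the second stage ignores appearance, on the reasoning that embeddings taken from low-confidence (for example dim or occluded) detections are less reliable than box overlap. Detections scoring below `low` are discarded, and tracks left unmatched after both stages are returned as `lost`.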
