Ren Zelin, Pang Lan, Wang Chao, Li Jiaheng, Zhou Fangyan. Low-light pedestrian detection and tracking algorithm based on autoencoder structure and improved Bytetrack[J]. Journal of Applied Optics.


Low-light pedestrian detection and tracking algorithm based on autoencoder structure and improved Bytetrack


    Abstract: The extensive application of deep learning in computer vision has driven the rapid development of technologies such as intelligent security, intelligent transportation, and autonomous driving, for which the automatic detection and tracking of pedestrians is a prerequisite. However, accurately detecting and tracking pedestrians in low-light night scenes remains difficult. To address the problems of difficult target feature extraction and unstable tracking in such scenes, this paper proposes a multi-target pedestrian detection and tracking algorithm based on an autoencoder structure and an improved Bytetrack. In the detection phase, a multi-task auto-encoding transformation model framework is built on YOLOX (you only look once X): the physical noise model and the image signal processing (ISP) pipeline are considered in a self-supervised manner, the intrinsic visual structure is learned by encoding and decoding the real illumination degradation transformation process, and the object detection task is implemented by decoding bounding box coordinates and classes from this representation. To suppress interference from background noise, the adaptive spatial feature fusion (ASFF) module is introduced into the neck network of the target decoder. In the tracking phase, the Bytetrack algorithm is improved: the appearance embeddings extracted by a Transformer-based re-identification network and the motion information obtained from an NSA Kalman filter are combined through adaptive weighting to complete data association, and Byte's two-stage matching completes nighttime pedestrian tracking. The generalization ability of the detection model was tested on a self-built night pedestrian detection dataset, where mAP@0.5 reached 94.9%.
The results show that the degradation transformation process in this paper matches realistic conditions and generalizes well. Finally, multi-target tracking performance was verified on a self-built night pedestrian tracking dataset. The experimental results show that the proposed night low-light pedestrian multi-target tracking algorithm achieves a MOTA (multiple object tracking accuracy) of 89.55%, an IDF1 of 88.34%, and 15 IDs (identity switches). Compared with the baseline method Bytetrack, MOTA improves by 10.72%, IDF1 improves by 6.19%, and IDs are reduced by 50%. These results show that the proposed multi-target tracking algorithm based on an autoencoder structure and improved Bytetrack can effectively solve the problem of pedestrian tracking in low-light night scenes.
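The tracking-stage components named in the abstract (NSA Kalman filtering and adaptive weighting of appearance and motion cues) can be sketched roughly as follows. This is a hypothetical illustration under stated assumptions, not the paper's implementation: the function names, the per-detection weighting scheme, and the `base_lambda` parameter are all assumptions.

```python
import numpy as np

def nsa_measurement_noise(R, det_score):
    """NSA Kalman filter idea: scale the measurement noise covariance by
    detection confidence, so low-confidence boxes are trusted less.
    Commonly written as R~_k = (1 - c_k) * R_k."""
    return (1.0 - det_score) * R

def appearance_cost(track_embs, det_embs):
    """Cosine distance between L2-normalized Re-ID embeddings
    (one row per track / per detection)."""
    t = track_embs / np.linalg.norm(track_embs, axis=1, keepdims=True)
    d = det_embs / np.linalg.norm(det_embs, axis=1, keepdims=True)
    return 1.0 - t @ d.T  # shape (num_tracks, num_dets)

def adaptive_fuse(app_cost, motion_cost, det_scores, base_lambda=0.98):
    """Adaptively weight appearance vs. motion cost per detection:
    dim or occluded (low-score) detections lean more on motion."""
    lam = base_lambda * det_scores[None, :]  # broadcast over tracks
    return lam * app_cost + (1.0 - lam) * motion_cost
```

In a Byte-style associator, the fused cost matrix would feed an assignment step (e.g. Hungarian matching) over high-score detections first, with the remaining tracks then matched against low-score detections in a second pass.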
