Abstract:
To address the difficulty of extracting target features and the instability of tracking in low-light night scenes, a multi-target pedestrian detection and tracking algorithm based on an autoencoder structure and an improved ByteTrack is proposed. In the detection phase, a multi-task auto-encoding transformation framework was built on You Only Look Once X (YOLOX). Taking the physical noise model and the image signal processing (ISP) pipeline into account, the framework learns intrinsic features in a self-supervised manner by encoding and decoding the real illumination degradation transformation, and performs object detection by decoding bounding box coordinates and classes from this representation. To suppress interference from background noise, the adaptively spatial feature fusion (ASFF) module was introduced into the neck network of the target decoder. In the tracking phase, the ByteTrack algorithm was improved: appearance embeddings extracted by a Transformer-based re-identification network and motion information obtained from the NSA Kalman filter were combined through an adaptive weighting method to complete data association, and the Byte two-stage matching algorithm was used to track pedestrians at night. The generalization ability of the detection model was tested on a self-built night pedestrian detection data set, where the mAP@0.5 reached 94.9%. This result indicates that the proposed degradation transformation matches realistic conditions and generalizes well. Finally, multi-target tracking performance was verified on a night pedestrian tracking data set. The experimental results show that the proposed low-light night pedestrian multi-target tracking algorithm achieves a multiple object tracking accuracy (MOTA) of 89.55% and an identity F1 score (IDF1) of 88.34%, with 15 ID switches (IDs).
Compared with the baseline ByteTrack, the MOTA is improved by 10.72%, the IDF1 is improved by 6.19%, and the number of ID switches is reduced by 50%. These results indicate that the proposed multi-target tracking algorithm, based on the autoencoder structure and improved ByteTrack, effectively addresses the difficulty of tracking pedestrians in low-light night scenes.
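
The association step described above can be illustrated with a minimal sketch. It is not the implementation evaluated here. It assumes the standard NSA Kalman form, in which the measurement noise covariance is scaled by one minus the detection confidence. It also assumes that the appearance and motion costs are fused as a convex combination, and that detections are split into high- and low-score groups for the two Byte matching passes. The function names, the `weight` parameter, and the threshold values are hypothetical and chosen only for illustration.

```python
def nsa_measurement_noise(base_noise, confidence):
    """Scale a measurement noise covariance (list of rows) by (1 - confidence).

    High-confidence detections get smaller noise, so the filter trusts them more.
    """
    scale = 1.0 - confidence
    return [[scale * value for value in row] for row in base_noise]


def fused_cost(appearance_cost, motion_cost, weight):
    """Combine appearance and motion costs.

    `weight` in [0, 1] stands in for the adaptive weight; how it is actually
    computed is not specified in the abstract, so it is an input here.
    """
    return weight * appearance_cost + (1.0 - weight) * motion_cost


def split_by_confidence(detections, high_thresh=0.6, low_thresh=0.1):
    """Split detections for the two Byte matching passes.

    The thresholds are illustrative values, not the settings used in the paper.
    """
    high = [d for d in detections if d["score"] >= high_thresh]
    low = [d for d in detections if low_thresh <= d["score"] < high_thresh]
    return high, low


if __name__ == "__main__":
    detections = [{"id": 0, "score": 0.9}, {"id": 1, "score": 0.3}, {"id": 2, "score": 0.05}]
    first_pass, second_pass = split_by_confidence(detections)
    print(len(first_pass), len(second_pass))
    print(nsa_measurement_noise([[2.0, 0.0], [0.0, 2.0]], 0.75))
    print(fused_cost(0.2, 0.6, 0.5))
```

In this sketch, the first pass would match existing tracks to the high-score detections using the fused cost, and the second pass would match the remaining tracks to the low-score detections. Detections below the lower threshold are discarded.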