Night vision dense crowd counting based on mid-term fusion of thermal imaging features

REN Guoyin, LYU Xiaoqi, LI Yuhao

Citation: REN Guoyin, LYU Xiaoqi, LI Yuhao. Night vision dense crowd counting based on mid-term fusion of thermal imaging features[J]. Journal of Applied Optics, 2022, 43(6): 1088-1096. doi: 10.5768/JAO202243.0604007


doi: 10.5768/JAO202243.0604007
Funding: National Natural Science Foundation of China (61771266, 81571753); Baotou Youth Innovation Talent Project (0701011904)
Article information
    About the author:

    REN Guoyin (1985—), male, Ph.D. candidate and lecturer. His research focuses on deep learning and image processing. E-mail: renguoyin@imust.edu.cn

    Corresponding author:

    LYU Xiaoqi (1963—), male, Ph.D., professor. His research focuses on intelligent image processing. E-mail: lan_tian12345@hotmail.com

  • CLC number: TN223

Night vision dense crowd counting based on mid-term fusion of thermal imaging features

  • Abstract: To improve the robustness of crowd counting models against scale variation and light noise, a multimodal image fusion network is designed. A counting model for nighttime crowds is proposed, built around a sub-network, Rgb-T-net, which fuses thermal imaging features with visible-light image features and strengthens the network's ability to discriminate thermal and nighttime crowd features. The model regresses density maps generated with an adaptive Gaussian kernel, and night vision training and testing are carried out on the Rgb-T-CC dataset. The verified network achieves a mean absolute error of 18.16, a mean squared error of 32.14 and a target detection recall of 97.65%, with counting and detection performance superior to current state-of-the-art bimodal fusion methods. The experimental results show that the proposed multimodal feature fusion network can handle counting and detection in night vision environments, and ablation experiments further verify the effectiveness of the parameters of each part of the fusion model.
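
As a reading aid for the density-map regression mentioned in the abstract, the following is a minimal Python sketch of how a ground-truth density map can be built with an adaptive Gaussian kernel. It assumes a geometry-adaptive rule in which the kernel bandwidth follows the mean distance to the k nearest annotated heads; the function name, the k-nearest-neighbour rule and the scaling factor beta are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal sketch of adaptive-Gaussian-kernel density map generation.
# Assumption: the bandwidth is adapted from the mean distance to the k nearest
# annotated heads (a geometry-adaptive rule); the paper may use a different scheme.
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.spatial import KDTree

def adaptive_density_map(points, shape, k=3, beta=0.3):
    """points: (N, 2) array of (x, y) head annotations; shape: (H, W)."""
    density = np.zeros(shape, dtype=np.float32)
    if len(points) == 0:
        return density
    if len(points) > 1:
        tree = KDTree(points)
        # column 0 of the distances is each point's distance to itself, so skip it
        dists, _ = tree.query(points, k=min(k + 1, len(points)))
        sigmas = beta * dists[:, 1:].mean(axis=1)
    else:
        sigmas = np.array([15.0])  # fallback bandwidth for a single annotation
    for (x, y), sigma in zip(points, sigmas):
        impulse = np.zeros(shape, dtype=np.float32)
        iy = min(int(y), shape[0] - 1)
        ix = min(int(x), shape[1] - 1)
        impulse[iy, ix] = 1.0               # unit impulse at the head position
        density += gaussian_filter(impulse, sigma)
    return density                          # sums approximately to the head count
```
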
  • Fig. 1  Structure diagram of Rgb-T-net heatmap fusion perceptron

    Fig. 2  Flow chart of overall system design

    Fig. 3  Visualization of training curve parameters during the model training process

    Fig. 4  Visualization results on ShanghaiTech Part A, ShanghaiTech Part B, UCF-QNRF and UCF_CC_50 datasets (from left to right: input image, ground-truth annotation, Bayesian result and result of the proposed method)

    Fig. 5  Crowd detection results on the thermal image dataset

    Fig. 6  Comparison of average precision-recall curves

    Table 1  Parameter information for Rgb-T-CC dataset

    Dataset     Resolution   Data type   Number of images   Max   Min   Average   Total     Modality
    Rgb-T-CC    640×480      Rgb+T       4060               824   5     68        138,389   Rgb-T

    Table 2  Comparison of different state-of-the-art methods on Rgb-T-CC dataset

    Model (Rgb-T-CC dataset)      MAE      MSE
    UCNet[18]                     33.96    56.31
    HDFNet[19]                    22.36    33.93
    BBSNet[20]                    19.56    32.48
    MCNN[15]                      21.89    37.44
    CSRNet[16]                    20.40    35.26
    Bayesian Loss[17]             18.70    32.67
    MCNN+IADM[15]                 19.77    30.34
    CSRNet+IADM[15]               17.94    30.91
    Bayesian Loss+IADM[15]        15.61    28.18
    Rgb-T-net (ours)              18.16    32.14

    Table 3  Comparison of different fusion methods on Rgb-T-CC dataset

    Fusion method (Rgb-T-CC dataset)      MAE      MSE
    AGK                                   22.46    38.97
    AGK + Rgb-T-net (early fusion)        18.01    31.49
    AGK + Rgb-T-net (mid-term fusion)     18.16    32.14
    AGK + Rgb-T-net (late fusion)         19.35    34.71
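
Table 3 compares early, mid-term and late fusion of the RGB and thermal streams. The schematic PyTorch sketch below illustrates what these three fusion points mean in a generic two-stream density regressor; it is not the authors' Rgb-T-net, and the class name, channel widths and depths are illustrative assumptions only.

```python
# Schematic sketch of early / mid-term / late fusion in a two-stream network.
# NOT the authors' Rgb-T-net; channel widths and depths are illustrative.
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True))

class TwoStreamFusion(nn.Module):
    def __init__(self, mode="mid"):
        super().__init__()
        self.mode = mode
        cin = 4 if mode == "early" else 3       # early fusion: stack RGB and T at the input
        self.rgb_front = conv_block(cin, 32)
        self.t_front = conv_block(1, 32)        # separate thermal stream (unused in early mode)
        back_in = 64 if mode == "mid" else 32   # mid-term fusion: concatenate feature maps
        self.back = nn.Sequential(conv_block(back_in, 64), conv_block(64, 64))
        self.head = nn.Conv2d(64, 1, 1)         # 1-channel density map; count = output.sum()

    def forward(self, rgb, thermal):
        if self.mode == "early":                # fuse raw inputs before any convolution
            return self.head(self.back(self.rgb_front(torch.cat([rgb, thermal], dim=1))))
        f_rgb, f_t = self.rgb_front(rgb), self.t_front(thermal)
        if self.mode == "mid":                  # fuse intermediate feature maps
            return self.head(self.back(torch.cat([f_rgb, f_t], dim=1)))
        # late fusion: run each stream to a density map, then average the predictions
        return 0.5 * (self.head(self.back(f_rgb)) + self.head(self.back(f_t)))

# Example at the 640×480 resolution listed in Table 1
model = TwoStreamFusion(mode="mid")
rgb, t = torch.randn(1, 3, 480, 640), torch.randn(1, 1, 480, 640)
print(float(model(rgb, t).sum()))               # predicted crowd count
```
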
  • [1] XU Mingliang, GE Zhaoyang, JIANG Xiaoheng, et al. Depth information guided crowd counting for complex crowd scenes[J]. Pattern Recognition Letters,2019,12(5):563-569.
    [2] GAO Kaijun, SUN Shaoyuan, YAO Guangshun, et al. Semantic segmentation of night vision images for unmanned vehicles based on deep learning[J]. Journal of Applied Optics,2017,38(3):421-428. (in Chinese)
    [3] WU Haibing, TAO Shengxiang, ZHANG Liang, et al. Tricolor acquisition and true color images fusion method under low illumination condition[J]. Journal of Applied Optics,2016,37(5):673-679. (in Chinese)
    [4] MIAO Yunqi, HAN Jungong, GAO Yongsheng, et al. ST-CNN: spatial-temporal convolutional neural network for crowd counting in videos[J]. Pattern Recognition Letters,2019,125(3):113-118.
    [5] LIU X, YANG J, DING W, et al. Adaptive mixture regression network with local counting map for crowd counting[C]//European Conference on Computer Vision, August 23-28, 2020, Glasgow. UK: Springer, 2020: 241-257.
    [6] ZHOU Yuan, YANG Jianxing, LI Hongru, et al. Adversarial learning for multiscale crowd counting under complex scenes[J]. IEEE Transactions on Cybernetics,2021,51(11):5423-5432. doi: 10.1109/TCYB.2019.2956091
    [7] BOOMINATHAN L, KRUTHIVENTI S S S, BABU R V. Crowdnet: a deep convolutional network for dense crowd counting[C]//Proceedings of the 24th ACM international conference on Multimedia, October 15-19, 2016, New York, NY. United States: ACM, 2016: 640-644.
    [8] SAMUEL M, SAMUEL-SOMA M A, MOVEH F F. AI-driven thermal people counting for smart window facade using portable low-cost miniature thermal imaging sensors[J]. 2020, 16(5): 1566-1574.
    [9] LIU D, ZHANG K, CHEN Z. Attentive cross-modal fusion network for RGB-D saliency detection[J]. IEEE Transactions on Multimedia,2020,23(1):967-981.
    [10] XU G, LI X, ZHANG X, et al. Loop closure detection in RGB-D SLAM by utilizing Siamese ConvNet features[J]. Applied Sciences,2022,12(1):62-75.
    [11] TANG Z, XU T, LI H, et al. Exploring fusion strategies for accurate RGBT visual object tracking[J]. arXiv preprint arXiv: 2201.08673, 2022.
    [12] ZHANG W, GUO X, WANG J, et al. Asymmetric adaptive fusion in a two-stream network for RGB-D human detection[J]. Sensors,2021,21(3):916-921. doi: 10.3390/s21030916
    [13] ZHOU Wujie, JIN Jianhui, LEI Jingsheng, et al. CEGFNet: common extraction and gate fusion network for scene parsing of remote sensing images[J]. IEEE Transactions on Geoscience and Remote Sensing,2021,19(6):1524-1535.
    [14] ZHANG Shihui, LI He, KONG Weihang. A cross-modal fusion based approach with scale-aware deep representation for RGB-D crowd counting and density estimation[J]. Expert Systems with Applications,2021,180(5):115071.
    [15] LIU Lingbo, CHEN Jiaqi, WU Hefeng, et al. Cross-modal collaborative representation learning and a large-scale RGBT benchmark for crowd counting[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 20-25, 2021, Nashville, TN. USA: IEEE, 2021: 4823-4833.
    [16] LI Yuhong, ZHANG Xiaofan, CHEN Deming. CSRNet: dilated convolutional neural networks for understanding the highly congested scenes[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, June 18-22, 2018, Salt Lake City, Utah. USA: IEEE, 2018: 1091-1100.
    [17] FISCHER M, VIGNES A. An imprecise bayesian approach to thermal runaway probability[C]//International Symposium on Imprecise Probability: Theories and Applications, July 6-9, 2021, University of Granada, Granada. Spain: PMLR, 2021: 150-160.
    [18] LIU Lingbo, CHEN Jiaqi, WU Hefeng, et al. Cross-modal collaborative representation learning and a large-scale RGBT benchmark for crowd counting[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 20-25, 2021, Nashville, TN. USA: IEEE, 2021: 4823-4833.
    [19] LIU Z, HE Z, WANG L, et al. VisDrone-CC2021: the vision meets drone crowd counting challenge results[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, October 11-17, 2021, Montreal, QC, Canada. USA: IEEE, 2021: 2830-2838.
    [20] ZHOU Wujie, GUO Qinling, LEI Jingsheng, et al. ECFFNet: effective and consistent feature fusion network for RGB-T salient object detection[J]. IEEE Transactions on Circuits and Systems for Video Technology,2022,32(3):1224-1235. doi: 10.1109/TCSVT.2021.3077058
    [21] FAN J, YANG X, LU R, et al. Design and implementation of intelligent inspection and alarm flight system for epidemic prevention[J]. Drones,2021,5(3):68-82. doi: 10.3390/drones5030068
Publication history
  • Received: 2022-04-11
  • Revised: 2022-05-30
  • Available online: 2022-09-19
  • Published in issue: 2022-11-14
