Multi-scale oriented object detection based on improved RoI Transformer in remote sensing images

LIU Minhao, WANG Kun, JIN Ruijiao, LU Tian, LI Zhang

Citation: LIU Minhao, WANG Kun, JIN Ruijiao, LU Tian, LI Zhang. Multi-scale oriented object detection based on improved RoI Transformer in remote sensing images[J]. Journal of Applied Optics, 2023, 44(5): 1010-1021. DOI: 10.5768/JAO202344.0502001


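Funding: National Natural Science Foundation of China (61801491)
    About the author:

    LIU Minhao (born 1999), female, master's student, mainly engaged in research on deep learning and image processing, and oriented object detection. E-mail:lmh313@nudt.edu.com

    Corresponding author:

    LI Zhang (born 1985), male, Ph.D., research fellow, mainly engaged in research on image measurement and visual navigation in aerospace, the fundamental theory and engineering applications of computer vision, and their extension to interdisciplinary medical engineering. E-mail:lizhang08@nudt.edu.cn

  • CLC number: TN26; TP391.4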

Multi-scale oriented object detection based on improved RoI Transformer in remote sensing images

  • Abstract:

    Oriented object detection is a crucial task in remote sensing image processing, but the large scale variations and arbitrary orientations of objects make automatic detection challenging. To address these problems, an improved RoI Transformer detection framework is proposed. First, the RoI Transformer framework is used to obtain rotated region of interest (RRoI) features for robust geometric feature extraction. Second, a high-resolution network (HRNet) is introduced into the detector to extract multi-resolution feature maps, which maintains high-resolution features while adapting to multi-scale changes of objects. Finally, the Kullback-Leibler divergence (KLD) loss is introduced to solve the angle periodicity problem in the representation of oriented objects, which improves the adaptability of the detector to objects in arbitrary directions; the joint optimization of the oriented bounding box parameters also improves localization accuracy. The proposed method, called HRD-ROI Transformer (HRNet + KLD ROI Transformer), was compared with typical oriented object detection methods on two public datasets, DOTAv1.0 and DIOR-R. The results show that, compared with the original RoI Transformer framework, the mean average precision (mAP) on DOTAv1.0 and DIOR-R is improved by 3.7 and 4.0 percentage points, respectively.
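The KLD loss mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common formulation in which an oriented box (x, y, w, h, θ) is modelled as a 2-D Gaussian with mean (x, y) and covariance R·diag(w²/4, h²/4)·Rᵀ, and the loss is built from the KL divergence between the predicted and target Gaussians. The function names, the `tau` parameter, and the `log1p` mapping to a bounded loss are illustrative assumptions, not the authors' code.

```python
import numpy as np

def rbox_to_gaussian(box):
    """Map an oriented box (x, y, w, h, theta in radians) to a 2-D Gaussian."""
    x, y, w, h, theta = box
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    scale = np.diag([w * w / 4.0, h * h / 4.0])
    return np.array([x, y], dtype=float), rot @ scale @ rot.T

def kld(box_p, box_t):
    """KL divergence D(N_p || N_t) between the Gaussians of two boxes."""
    mu_p, sig_p = rbox_to_gaussian(box_p)
    mu_t, sig_t = rbox_to_gaussian(box_t)
    sig_t_inv = np.linalg.inv(sig_t)
    d = mu_t - mu_p
    return 0.5 * (
        np.trace(sig_t_inv @ sig_p)
        + d @ sig_t_inv @ d
        + np.log(np.linalg.det(sig_t) / np.linalg.det(sig_p))
        - 2.0
    )

def kld_loss(box_p, box_t, tau=1.0):
    """Bounded regression loss; tau and log1p are assumed hyper-choices."""
    dist = max(kld(box_p, box_t), 0.0)  # guard against tiny negative round-off
    return 1.0 - 1.0 / (tau + np.log1p(dist))

# The same physical box written two ways: (w, h, 0) and (h, w, pi/2).
box_a = (0.0, 0.0, 4.0, 2.0, 0.0)
box_b = (0.0, 0.0, 2.0, 4.0, np.pi / 2)
print(kld_loss(box_a, box_b))                     # approximately 0
print(kld_loss(box_a, (1.0, 0.0, 4.0, 2.0, 0.3)))  # strictly positive
```

Because both parameterizations of the same box map to the same Gaussian, the loss does not depend on which side of the angle boundary the box is encoded, and position, size, and angle are penalized through a single coupled term rather than independently.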

  • Figure 1. Comparison between remote sensing images (first row) and natural images (second row)

    Figure 2. Structure diagram of HRD-ROI Transformer

    Figure 3. Structure diagram of HRNet[18]

    Figure 4. Schematic diagram of angle boundary discontinuity

    Figure 5. Schematic diagram of the square-like problem

    Figure 6. Comparison of detection results (false detection)

    Figure 7. Comparison of detection results (missed detection)

    Figure 8. Comparison of detection results (objects with large aspect ratios)

    Figure 9. Effectiveness of KLD on DIOR-R dataset

    Figure 10. Detection results of airport

    Figure 11. Detection results of golf course
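The angle boundary discontinuity named in Figure 4 can be made concrete with a small numeric sketch. It assumes an angle representation with a 180° period (for example θ in [-90°, 90°)) and a Smooth L1 penalty applied directly to the angle parameter in radians; the helper names are hypothetical and the numbers are illustrative, not taken from the paper.

```python
import math

def physical_gap_deg(theta_a, theta_b, period=180.0):
    """Smallest rotation (degrees) that aligns two box orientations."""
    d = abs(theta_a - theta_b) % period
    return min(d, period - d)

def smooth_l1(x, beta=1.0):
    """Standard Smooth L1 penalty on a scalar residual."""
    x = abs(x)
    return 0.5 * x * x / beta if x < beta else x - 0.5 * beta

# Two nearly identical boxes whose angles sit on opposite sides of the boundary.
pred_deg, target_deg = 89.0, -89.0

param_gap = abs(pred_deg - target_deg)               # 178 degrees in parameter space
true_gap = physical_gap_deg(pred_deg, target_deg)    # 2 degrees in reality

print(param_gap, smooth_l1(math.radians(param_gap)))
print(true_gap, smooth_l1(math.radians(true_gap)))
```

A regression loss computed on the raw angle parameter penalizes this pair as if the boxes were almost perpendicular, even though they differ by only two degrees, which is the kind of mismatch a distribution-based loss such as KLD is meant to avoid.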

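    Table 1   Performance comparison of different methods on DOTAv1.0 dataset (per-class AP/% and mAP/%)

    | Stage | Method | Backbone | Loss | PL | BD | BR | GTF | SV | LV | SH | TC | BC | ST | SBF | RA | HA | SP | HC | mAP/% |
    |---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
    | One-stage | Rotated RetinaNet | ResNet50 | Smooth L1 | 89.7 | 75.0 | 40.8 | 64.1 | 66.5 | 67.7 | 85.8 | 90.7 | 62.6 | 65.7 | 54.4 | 62.0 | 62.6 | 52.2 | 54.5 | 66.3 |
    | One-stage | R3Det | ResNet50 | Smooth L1 | 89.5 | 73.2 | 44.4 | 65.3 | 66.9 | 77.2 | 87.2 | 90.8 | 57.9 | 66.2 | 51.3 | 63.2 | 72.1 | 53.0 | 54.6 | 67.5 |
    | One-stage | S2ANet | ResNet50 | Smooth L1 | 89.0 | 73.8 | 43.6 | 67.1 | 64.9 | 74.2 | 79.1 | 90.5 | 62.7 | 66.3 | 56.8 | 64.8 | 61.2 | 54.2 | 42.0 | 66.0 |
    | One-stage | SASM reppoints | ResNet50 | GIoU | 89.5 | 76.0 | 45.3 | 70.7 | 59.9 | 74.6 | 78.0 | 90.3 | 64.1 | 67.3 | 46.2 | 67.1 | 70.3 | 56.3 | 44.3 | 66.7 |
    | One-stage | Oriented reppoints | ResNet50 | GIoU | 89.7 | 75.7 | 49.8 | 70.7 | 74.1 | 80.5 | 88.4 | 90.5 | 65.1 | 68.6 | 47.1 | 64.6 | 70.4 | 57.8 | 54.6 | 69.8 |
    | Two-stage | Rotated Faster RCNN | ResNet50 | Smooth L1 | 88.5 | 74.7 | 44.1 | 70.0 | 63.7 | 71.4 | 79.4 | 90.5 | 58.7 | 62.0 | 54.7 | 64.5 | 63.2 | 58.2 | 50.1 | 66.3 |
    | Two-stage | Oriented RCNN | ResNet50 | Smooth L1 | 89.1 | 75.8 | 50.0 | 68.3 | 62.3 | 84.0 | 88.8 | 90.6 | 68.7 | 62.3 | 57.0 | 63.6 | 66.4 | 57.3 | 39.1 | 68.2 |
    | Two-stage | RoI Transformer | ResNet50 | Smooth L1 | 89.4 | 77.7 | 46.8 | 71.9 | 68.4 | 77.9 | 80.0 | 90.7 | 71.3 | 62.5 | 59.1 | 63.6 | 67.3 | 60.2 | 45.4 | 68.8 |
    | Two-stage | ReDet | ReResNet50 | Smooth L1 | 89.6 | 78.0 | 47.4 | 68.8 | 65.8 | 82.4 | 87.4 | 90.6 | 67.5 | 69.7 | 63.4 | 65.9 | 67.3 | 53.0 | 48.7 | 69.7 |
    | Two-stage | Ours | HRNet | KLD | 89.8 | 75.4 | 54.7 | 78.9 | 68.8 | 78.6 | 89.3 | 90.7 | 75.7 | 62.8 | 67.0 | 67.2 | 75.3 | 60.7 | 52.1 | 72.5 |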
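As a quick consistency check on Table 1, the mAP column can be reproduced as the unweighted mean of the 15 per-class AP values. The sketch below does this for the RoI Transformer baseline and the proposed method, using only numbers copied from the table; the variable and function names are illustrative.

```python
CLASSES = ["PL", "BD", "BR", "GTF", "SV", "LV", "SH", "TC",
           "BC", "ST", "SBF", "RA", "HA", "SP", "HC"]

# Per-class AP (%) on DOTAv1.0, copied from Table 1.
ap_roi_transformer = [89.4, 77.7, 46.8, 71.9, 68.4, 77.9, 80.0, 90.7,
                      71.3, 62.5, 59.1, 63.6, 67.3, 60.2, 45.4]
ap_ours = [89.8, 75.4, 54.7, 78.9, 68.8, 78.6, 89.3, 90.7,
           75.7, 62.8, 67.0, 67.2, 75.3, 60.7, 52.1]

def mean_ap(aps):
    """mAP as the unweighted mean of per-class AP values."""
    return sum(aps) / len(aps)

map_base = mean_ap(ap_roi_transformer)
map_ours = mean_ap(ap_ours)
gain = map_ours - map_base

# Per-class gain of the proposed method over the baseline.
per_class_gain = {c: o - b for c, o, b in zip(CLASSES, ap_ours, ap_roi_transformer)}
best_class = max(per_class_gain, key=per_class_gain.get)

print(round(map_base, 1), round(map_ours, 1), round(gain, 1))
print(best_class, round(per_class_gain[best_class], 1))
```

The rounded means match the reported 68.8% and 72.5%, and the per-class differences show that the gain is concentrated in a few categories (ship, harbor, bridge, soccer-ball field) rather than spread evenly across classes.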

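    Table 2   Performance comparison of different methods on DIOR-R dataset (per-class AP/% and mAP/%)

    | Stage | Method | Backbone | Loss | APL | APO | BF | BC | BR | CH | ESA | ETS | DAM | GF | GTF | HA | OP | SH | STA | STO | TC | TS | VE | WM | mAP/% |
    |---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
    | One-stage | Rotated RetinaNet | ResNet50 | Smooth L1 | 59.1 | 15.0 | 70.4 | 81.1 | 14.5 | 72.6 | 64.9 | 46.6 | 14.6 | 70.9 | 74.7 | 24.8 | 30.2 | 67.0 | 69.1 | 50.1 | 81.2 | 41.6 | 32.5 | 61.9 | 52.1 |
    | One-stage | Rotated RetinaNet-G | ResNet50 | GWD | 64.6 | 21.1 | 72.9 | 81.1 | 13.1 | 72.7 | 68.5 | 45.8 | 14.7 | 70.1 | 75.1 | 27.2 | 30.6 | 68.9 | 66.1 | 57.9 | 81.2 | 47.4 | 34.8 | 61.5 | 53.8 |
    | One-stage | R3Det | ResNet50 | Smooth L1 | 53.3 | 27.9 | 68.9 | 81.0 | 22.9 | 72.6 | 66.4 | 49.6 | 19.2 | 68.4 | 76.0 | 22.1 | 41.5 | 68.3 | 57.9 | 55.4 | 81.1 | 45.5 | 35.7 | 54.0 | 53.4 |
    | One-stage | R3Det-K | ResNet50 | KLD | 57.8 | 34.9 | 69.4 | 81.2 | 28.5 | 72.7 | 71.8 | 53.2 | 16.1 | 71.8 | 77.1 | 36.4 | 47.6 | 74.5 | 62.5 | 60.8 | 81.3 | 50.0 | 39.8 | 56.2 | 57.2 |
    | One-stage | S2ANet | ResNet50 | KFIoU | 67.2 | 28.0 | 76.0 | 80.8 | 27.3 | 72.6 | 61.2 | 60.3 | 17.9 | 68.6 | 78.2 | 26.2 | 44.6 | 77.7 | 65.8 | 67.4 | 81.3 | 48.9 | 42.2 | 63.1 | 57.8 |
    | One-stage | SASM reppoints | ResNet50 | GIoU | 61.2 | 52.1 | 74.5 | 82.7 | 32.4 | 72.5 | 76.0 | 58.1 | 34.9 | 71.3 | 77.1 | 38.6 | 51.5 | 79.1 | 64.8 | 66.3 | 80.7 | 60.5 | 41.7 | 64.2 | 62.0 |
    | One-stage | Oriented reppoints | ResNet50 | GIoU | 68.7 | 41.9 | 75.1 | 84.0 | 35.4 | 75.4 | 79.5 | 65.8 | 32.1 | 75.0 | 78.6 | 43.4 | 51.8 | 80.3 | 66.5 | 66.4 | 85.4 | 54.0 | 46.2 | 65.0 | 63.5 |
    | Two-stage | Rotated Faster RCNN | ResNet50 | Smooth L1 | 62.0 | 18.1 | 71.3 | 81.0 | 22.9 | 72.5 | 61.0 | 58.5 | 10.0 | 67.6 | 78.8 | 34.3 | 38.9 | 80.4 | 58.8 | 62.4 | 81.3 | 44.7 | 41.3 | 64.3 | 55.5 |
    | Two-stage | Oriented RCNN | ResNet50 | Smooth L1 | 61.8 | 26.7 | 71.6 | 81.3 | 33.8 | 72.6 | 74.0 | 58.4 | 23.7 | 66.8 | 80.0 | 29.9 | 52.0 | 81.0 | 62.5 | 62.4 | 81.4 | 50.6 | 42.3 | 65.0 | 58.9 |
    | Two-stage | RoI Transformer | ResNet50 | Smooth L1 | 63.1 | 30.7 | 71.8 | 81.5 | 33.9 | 72.7 | 75.8 | 64.6 | 24.3 | 67.4 | 82.5 | 35.7 | 51.1 | 81.2 | 70.5 | 70.8 | 81.5 | 44.4 | 43.4 | 66.0 | 60.7 |
    | Two-stage | ReDet | ReResNet50 | Smooth L1 | 71.0 | 28.3 | 71.5 | 88.7 | 31.3 | 72.7 | 71.6 | 61.1 | 20.8 | 61.8 | 81.9 | 36.7 | 48.8 | 81.1 | 63.1 | 62.5 | 81.6 | 49.2 | 42.8 | 64.6 | 59.6 |
    | Two-stage | Ours | HRNet | KLD | 63.1 | 41.6 | 79.0 | 88.0 | 42.1 | 72.6 | 76.6 | 65.8 | 28.2 | 71.0 | 82.9 | 42.2 | 57.1 | 81.3 | 72.5 | 70.4 | 89.7 | 53.3 | 49.1 | 66.3 | 64.7 |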

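    Table 3   Detection results for small objects on DOTAv1.0 and DIOR-R datasets (AP/%)

    | Stage | Method | Backbone | Loss | DIOR-R SH | DIOR-R VE | DIOR-R WM | DOTAv1.0 SV | DOTAv1.0 SH |
    |---|---|---|---|---|---|---|---|---|
    | One-stage | Rotated RetinaNet | ResNet50 | Smooth L1 | 67.0 | 32.5 | 61.9 | 66.5 | 85.8 |
    | One-stage | R3Det | ResNet50 | Smooth L1 | 68.3 | 35.7 | 54.0 | 66.9 | 87.2 |
    | One-stage | S2ANet | ResNet50 | Smooth L1 | 77.7 | 42.2 | 63.1 | 64.9 | 79.1 |
    | One-stage | SASM reppoints | ResNet50 | GIoU | 79.1 | 41.7 | 64.2 | 59.9 | 78.0 |
    | One-stage | Oriented reppoints | ResNet50 | GIoU | 80.3 | 46.2 | 65.0 | 74.1 | 88.4 |
    | Two-stage | Rotated Faster RCNN | ResNet50 | Smooth L1 | 80.4 | 41.3 | 64.3 | 63.7 | 79.4 |
    | Two-stage | Oriented RCNN | ResNet50 | Smooth L1 | 81.0 | 42.3 | 65.0 | 62.3 | 88.8 |
    | Two-stage | RoI Transformer | ResNet50 | Smooth L1 | 81.2 | 43.4 | 66.0 | 68.4 | 80.0 |
    | Two-stage | ReDet | ReResNet50 | Smooth L1 | 81.1 | 42.8 | 64.6 | 65.8 | 87.4 |
    | Two-stage | Ours | HRNet | KLD | 81.3 | 49.1 | 66.3 | 68.8 | 89.3 |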

    Table 4   Comparison of effectiveness of KLD and HRNet on DOTAv1.0 dataset

    | Method | KLD | HRNet | mAP/% |
    |---|---|---|---|
    | Rotated Faster RCNN | | | 66.3 |
    | RoI Transformer | | | 68.8 |
    | Ours(a) | ✓ | | 70.3 |
    | Ours(b) | | ✓ | 71.7 |
    | Ours(c) | ✓ | ✓ | 72.5 |

    Table 5   Comparison of effectiveness of KLD and HRNet on DIOR-R dataset

    | Method | KLD | HRNet | mAP/% |
    |---|---|---|---|
    | Rotated Faster RCNN | | | 55.5 |
    | RoI Transformer | | | 60.7 |
    | Ours(a) | ✓ | | 61.5 |
    | Ours(b) | | ✓ | 63.9 |
    | Ours(c) | ✓ | ✓ | 64.7 |

    Table 6   Comparison of mAP for three loss function models

    | Loss function | DOTAv1.0 mAP/% | DIOR-R mAP/% |
    |---|---|---|
    | GWD | 69.2 | 61.4 |
    | KFIoU | 68.9 | 60.3 |
    | KLD | 70.3 | 61.5 |
  • [1] LIU L, OUYANG W, WANG X G, et al. Deep learning for generic object detection: a survey[J]. International Journal of Computer Vision, 2020, 128(2): 261-318. doi: 10.1007/s11263-019-01247-4

    [2] FU Changhong, CHEN Kunhui, LU Kunhan, et al. Aviation fastener rotation detection for intelligent optical perception with edge computing[J]. Journal of Applied Optics, 2022, 43(3): 472-480.

    [3] DING J, XUE N, LONG Y, et al. Learning RoI transformer for oriented object detection in aerial images[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2019: 2849-2858.

    [4] QIAN W, YANG X, PENG S L, et al. Learning modulated loss for rotated object detection[C]//Proceedings of the AAAI Conference on Artificial Intelligence. [S.l.]: [s.n.], 2021: 2458-2466.

    [5] MA J Q, SHAO W Y, YE H, et al. Arbitrary-oriented scene text detection via rotation proposals[J]. IEEE Transactions on Multimedia, 2018, 20(11): 3111-3122. doi: 10.1109/TMM.2018.2818020

    [6] XIE X X, CHENG G, WANG J B, et al. Oriented r-cnn for object detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 3520-3529.

    [7] HAN J M, DING J, XUE N, et al. Redet: a rotation-equivariant detector for aerial object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 2786-2795.

    [8] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2016: 770-778.

    [9] YANG X, YAN J C, MING Q, et al. Rethinking rotated object detection with Gaussian Wasserstein distance loss[C]//International Conference on Machine Learning. [S.l.]: [s.n.], 2021: 11830-11841.

    [10] YU Y, DA F P. Phase-shifting coder: predicting accurate orientation in oriented object detection[EB/OL]. [2023-06-19]. https://doi.org/10.48550/arXiv.2211.06368.

    [11] YANG X, YAN J C, FENG Z M, et al. R3det: refined single-stage detector with feature refinement for rotating object[C]//Proceedings of the AAAI Conference on Artificial Intelligence. [S.l.]: [s.n.], 2021: 3163-3171.

    [12] HOU L, LU K, XUE J, et al. Shape-adaptive selection and measurement for oriented object detection[C]//Proceedings of the AAAI Conference on Artificial Intelligence. [S.l.]: [s.n.], 2022: 923-932.

    [13] LI W, CHEN Y, HU K, et al. Oriented reppoints for aerial object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2022: 1829-1838.

    [14] WU Liequan, ZHOU Zhifeng, ZHU Zhiling, et al. Surface defect detection of patch diode based on improved YOLO-V4[J]. Journal of Applied Optics, 2023, 44(3): 621-627. doi: 10.5768/JAO202344.0303007

    [15] YANG X, YANG X J, YANG J R, et al. Learning high-precision bounding box for rotated object detection via kullback leibler divergence[C]//Advances in Neural Information Processing Systems. Cambridge: MIT Press, 2021: 18381-18394.

    [16] YANG X, ZHOU Y, ZHANG G F, et al. The kfiou loss for rotated object detection[EB/OL]. [2023-06-19]. https://doi.org/10.48550/arXiv.2201.12558.

    [17] WANG K, LI Z, SU A, et al. Oriented object detection in optical remote sensing images: a survey[EB/OL]. [2023-06-19]. https://doi.org/10.48550/arXiv.2302.10473.

    [18] WANG J D, SUN K, CHENG T S, et al. Deep high-resolution representation learning for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(10): 3349-3364. doi: 10.1109/TPAMI.2020.2983686

    [19] CAO Jiale, LI Yali, SUN Hanqing, et al. A survey on deep learning based visual object detection[J]. Journal of Image and Graphics, 2022, 27(6): 1697-1722. doi: 10.11834/jig.220069

    [20] YANG X, YAN J C, LIAO W L, et al. SCRDet++: detecting small, cluttered and rotated objects via instance-level feature denoising and rotation loss smoothing[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(2): 2384-2399. doi: 10.1109/TPAMI.2022.3166956

    [21] HAN J, DING J, LI J, et al. Align deep features for oriented object detection[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 1-11.

    [22] XIA G S, BAI X, DING J, et al. Dota: a large-scale dataset for object detection in aerial images[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2018: 3974-3983.

    [23] CHENG G, WANG J B, LI K, et al. Anchor-free oriented proposal generator for object detection[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 1-11.

    [24] LI K, WAN G, CHENG G, et al. Object detection in optical remote sensing images: a survey and a new benchmark[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 159: 296-307. doi: 10.1016/j.isprsjprs.2019.11.023

    [25] ZHOU Y, YANG X, ZHANG G F, et al. Mmrotate: a rotated object detection benchmark using pytorch[C]//Proceedings of the 30th ACM International Conference on Multimedia. [S.l.]: [s.n.], 2022: 7331-7334.

    [26] WU Y X, KIRILLOV A, MASSA F, et al. Detectron2[EB/OL]. [2023-06-19]. https://github.com/facebookresearch/detectron2.

    [27] WANG R S, DUAN Y F, LI Y K. Segmenting anything also detect anything[EB/OL]. [2023-06-19]. https://wwwww.easychair.org/publications/preprint_download/HVhP.

    [28] LI J, GONG Y X, MA Z, et al. Enhancing feature fusion using attention for small object detection[C]//2022 IEEE 8th International Conference on Computer and Communications. New York: IEEE, 2022: 1859-1863.

    [29] YUAN Y, ZHANG Y L. OLCN: an optimized low coupling network for small objects detection[J]. IEEE Geoscience and Remote Sensing Letters, 2021, 19: 1-5.


Publication history
  • Received:  2023-07-06
  • Revised:  2023-08-15
  • Published online:  2023-08-24
  • Issue date:  2023-09-14
