Multi-scale oriented object detection in remote sensing images based on an improved RoI Transformer
-
Abstract:
Oriented object detection is a crucial task in remote sensing image processing, where large scale variations and arbitrary orientations of objects pose challenges to automatic object detection. To address these problems, an improved RoI Transformer detection framework is proposed. First, the RoI Transformer detection framework is used to obtain rotated regions of interest (RRoIs) for robust geometric feature extraction. Second, a high-resolution network (HRNet) is introduced into the detector to extract multi-resolution feature maps, maintaining high-resolution features while adapting to the multi-scale variation of targets. Finally, the Kullback-Leibler divergence (KLD) loss is introduced to resolve the angle periodicity problem of the standard oriented bounding-box representation, improving the adaptability of the method to targets in arbitrary orientations; localization accuracy is further improved by jointly optimizing the oriented bounding-box parameters. The proposed method, HRD-ROI Transformer (HRNet+KLD ROI Transformer), is compared with typical oriented object detection methods on two public datasets, DOTAv1.0 and DIOR-R. The results show that, compared with the conventional RoI Transformer framework, the mean average precision (mAP) of the proposed method is improved by 3.7% on DOTAv1.0 and by 4% on DIOR-R.
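As a minimal illustration of the Gaussian modelling behind the KLD loss mentioned above (following the formulation in reference [15]; the notation here is illustrative rather than the paper's own), an oriented box (x, y, w, h, θ) is mapped to a two-dimensional Gaussian N(μ, Σ), so that a box and its equivalent with width and height swapped and the angle shifted by 90° yield the same Gaussian, removing the angle-periodicity ambiguity:

\mu = (x, y)^{\top}, \qquad
\Sigma = R_{\theta}\,\operatorname{diag}\!\left(\frac{w^{2}}{4}, \frac{h^{2}}{4}\right) R_{\theta}^{\top}, \qquad
R_{\theta} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}

D_{\mathrm{KL}}\!\left(\mathcal{N}_{p} \,\|\, \mathcal{N}_{t}\right)
= \frac{1}{2}\left[ (\mu_{p}-\mu_{t})^{\top}\Sigma_{t}^{-1}(\mu_{p}-\mu_{t})
+ \operatorname{tr}\!\left(\Sigma_{t}^{-1}\Sigma_{p}\right)
+ \ln\frac{\lvert\Sigma_{t}\rvert}{\lvert\Sigma_{p}\rvert} - 2 \right]

The divergence is then passed through a bounded non-linear transform to form the regression loss, and all box parameters drive the loss jointly, which is the joint optimization referred to above.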
-
Keywords:
- oriented object detection
- RoI Transformer
- high-resolution network
- object detection in remote sensing images
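To make the loss computation described in the abstract concrete, the following is a minimal, self-contained NumPy sketch of a KLD loss between two oriented boxes given as (cx, cy, w, h, theta). It follows the Gaussian formulation of the cited KLD loss [15]; the function names, the log1p transform and tau = 1 are illustrative assumptions, not necessarily the exact configuration used in this work.

import numpy as np

def box_to_gaussian(box):
    # Convert an oriented box (cx, cy, w, h, theta in radians) to a 2-D Gaussian (mu, Sigma).
    cx, cy, w, h, theta = box
    mu = np.array([cx, cy], dtype=float)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    sigma = rot @ np.diag([w ** 2 / 4.0, h ** 2 / 4.0]) @ rot.T
    return mu, sigma

def kld_loss(pred_box, target_box, tau=1.0):
    # Closed-form KL divergence between the two Gaussians, mapped to a bounded loss value.
    mu_p, sig_p = box_to_gaussian(pred_box)
    mu_t, sig_t = box_to_gaussian(target_box)
    sig_t_inv = np.linalg.inv(sig_t)
    diff = (mu_p - mu_t).reshape(2, 1)
    maha = (diff.T @ sig_t_inv @ diff).item()      # Mahalanobis distance term
    kld = 0.5 * (maha + np.trace(sig_t_inv @ sig_p)
                 + np.log(np.linalg.det(sig_t) / np.linalg.det(sig_p)) - 2.0)
    return 1.0 - 1.0 / (tau + np.log1p(kld))       # bounded in [0, 1)

# A box and the same box with width/height swapped and the angle shifted by 90 degrees
# define the same Gaussian, so the loss is ~0: angle periodicity is no longer an issue.
b1 = (50.0, 50.0, 20.0, 10.0, 0.3)
b2 = (50.0, 50.0, 10.0, 20.0, 0.3 + np.pi / 2)
print(kld_loss(b1, b2))                            # ~0.0
print(kld_loss(b1, (55.0, 48.0, 22.0, 9.0, 0.5)))  # > 0, grows with misalignment

In a detector this computation would run over batches of decoded proposals inside the regression head; the sketch only illustrates the per-box calculation.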
-
-
Table 1 Performance comparison of different methods on DOTAv1.0 dataset (per-class AP, %)

| Stage | Method | Backbone | Loss | PL | BD | BR | GTF | SV | LV | SH | TC | BC | ST | SBF | RA | HA | SP | HC | mAP/% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| One-stage | Rotated RetinaNet | ResNet50 | Smooth L1 | 89.7 | 75.0 | 40.8 | 64.1 | 66.5 | 67.7 | 85.8 | 90.7 | 62.6 | 65.7 | 54.4 | 62.0 | 62.6 | 52.2 | 54.5 | 66.3 |
| One-stage | R3Det | ResNet50 | Smooth L1 | 89.5 | 73.2 | 44.4 | 65.3 | 66.9 | 77.2 | 87.2 | 90.8 | 57.9 | 66.2 | 51.3 | 63.2 | 72.1 | 53.0 | 54.6 | 67.5 |
| One-stage | S2ANet | ResNet50 | Smooth L1 | 89.0 | 73.8 | 43.6 | 67.1 | 64.9 | 74.2 | 79.1 | 90.5 | 62.7 | 66.3 | 56.8 | 64.8 | 61.2 | 54.2 | 42.0 | 66.0 |
| One-stage | SASM RepPoints | ResNet50 | GIoU | 89.5 | 76.0 | 45.3 | 70.7 | 59.9 | 74.6 | 78.0 | 90.3 | 64.1 | 67.3 | 46.2 | 67.1 | 70.3 | 56.3 | 44.3 | 66.7 |
| One-stage | Oriented RepPoints | ResNet50 | GIoU | 89.7 | 75.7 | 49.8 | 70.7 | 74.1 | 80.5 | 88.4 | 90.5 | 65.1 | 68.6 | 47.1 | 64.6 | 70.4 | 57.8 | 54.6 | 69.8 |
| Two-stage | Rotated Faster RCNN | ResNet50 | Smooth L1 | 88.5 | 74.7 | 44.1 | 70.0 | 63.7 | 71.4 | 79.4 | 90.5 | 58.7 | 62.0 | 54.7 | 64.5 | 63.2 | 58.2 | 50.1 | 66.3 |
| Two-stage | Oriented RCNN | ResNet50 | Smooth L1 | 89.1 | 75.8 | 50.0 | 68.3 | 62.3 | 84.0 | 88.8 | 90.6 | 68.7 | 62.3 | 57.0 | 63.6 | 66.4 | 57.3 | 39.1 | 68.2 |
| Two-stage | RoI Transformer | ResNet50 | Smooth L1 | 89.4 | 77.7 | 46.8 | 71.9 | 68.4 | 77.9 | 80.0 | 90.7 | 71.3 | 62.5 | 59.1 | 63.6 | 67.3 | 60.2 | 45.4 | 68.8 |
| Two-stage | ReDet | ReResNet50 | Smooth L1 | 89.6 | 78.0 | 47.4 | 68.8 | 65.8 | 82.4 | 87.4 | 90.6 | 67.5 | 69.7 | 63.4 | 65.9 | 67.3 | 53.0 | 48.7 | 69.7 |
| Two-stage | Ours | HRNet | KLD | 89.8 | 75.4 | 54.7 | 78.9 | 68.8 | 78.6 | 89.3 | 90.7 | 75.7 | 62.8 | 67.0 | 67.2 | 75.3 | 60.7 | 52.1 | 72.5 |
Table 2 Performance comparison of different methods on DIOR-R dataset (per-class AP, %)

| Stage | Method | Backbone | Loss | APL | APO | BF | BC | BR | CH | ESA | ETS | DAM | GF | GTF | HA | OP | SH | STA | STO | TC | TS | VE | WM | mAP/% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| One-stage | Rotated RetinaNet | ResNet50 | Smooth L1 | 59.1 | 15.0 | 70.4 | 81.1 | 14.5 | 72.6 | 64.9 | 46.6 | 14.6 | 70.9 | 74.7 | 24.8 | 30.2 | 67.0 | 69.1 | 50.1 | 81.2 | 41.6 | 32.5 | 61.9 | 52.1 |
| One-stage | Rotated RetinaNet-G | ResNet50 | GWD | 64.6 | 21.1 | 72.9 | 81.1 | 13.1 | 72.7 | 68.5 | 45.8 | 14.7 | 70.1 | 75.1 | 27.2 | 30.6 | 68.9 | 66.1 | 57.9 | 81.2 | 47.4 | 34.8 | 61.5 | 53.8 |
| One-stage | R3Det | ResNet50 | Smooth L1 | 53.3 | 27.9 | 68.9 | 81.0 | 22.9 | 72.6 | 66.4 | 49.6 | 19.2 | 68.4 | 76.0 | 22.1 | 41.5 | 68.3 | 57.9 | 55.4 | 81.1 | 45.5 | 35.7 | 54.0 | 53.4 |
| One-stage | R3Det-K | ResNet50 | KLD | 57.8 | 34.9 | 69.4 | 81.2 | 28.5 | 72.7 | 71.8 | 53.2 | 16.1 | 71.8 | 77.1 | 36.4 | 47.6 | 74.5 | 62.5 | 60.8 | 81.3 | 50.0 | 39.8 | 56.2 | 57.2 |
| One-stage | S2ANet | ResNet50 | KFIoU | 67.2 | 28.0 | 76.0 | 80.8 | 27.3 | 72.6 | 61.2 | 60.3 | 17.9 | 68.6 | 78.2 | 26.2 | 44.6 | 77.7 | 65.8 | 67.4 | 81.3 | 48.9 | 42.2 | 63.1 | 57.8 |
| One-stage | SASM RepPoints | ResNet50 | GIoU | 61.2 | 52.1 | 74.5 | 82.7 | 32.4 | 72.5 | 76.0 | 58.1 | 34.9 | 71.3 | 77.1 | 38.6 | 51.5 | 79.1 | 64.8 | 66.3 | 80.7 | 60.5 | 41.7 | 64.2 | 62.0 |
| One-stage | Oriented RepPoints | ResNet50 | GIoU | 68.7 | 41.9 | 75.1 | 84.0 | 35.4 | 75.4 | 79.5 | 65.8 | 32.1 | 75.0 | 78.6 | 43.4 | 51.8 | 80.3 | 66.5 | 66.4 | 85.4 | 54.0 | 46.2 | 65.0 | 63.5 |
| Two-stage | Rotated Faster RCNN | ResNet50 | Smooth L1 | 62.0 | 18.1 | 71.3 | 81.0 | 22.9 | 72.5 | 61.0 | 58.5 | 10.0 | 67.6 | 78.8 | 34.3 | 38.9 | 80.4 | 58.8 | 62.4 | 81.3 | 44.7 | 41.3 | 64.3 | 55.5 |
| Two-stage | Oriented RCNN | ResNet50 | Smooth L1 | 61.8 | 26.7 | 71.6 | 81.3 | 33.8 | 72.6 | 74.0 | 58.4 | 23.7 | 66.8 | 80.0 | 29.9 | 52.0 | 81.0 | 62.5 | 62.4 | 81.4 | 50.6 | 42.3 | 65.0 | 58.9 |
| Two-stage | RoI Transformer | ResNet50 | Smooth L1 | 63.1 | 30.7 | 71.8 | 81.5 | 33.9 | 72.7 | 75.8 | 64.6 | 24.3 | 67.4 | 82.5 | 35.7 | 51.1 | 81.2 | 70.5 | 70.8 | 81.5 | 44.4 | 43.4 | 66.0 | 60.7 |
| Two-stage | ReDet | ReResNet50 | Smooth L1 | 71.0 | 28.3 | 71.5 | 88.7 | 31.3 | 72.7 | 71.6 | 61.1 | 20.8 | 61.8 | 81.9 | 36.7 | 48.8 | 81.1 | 63.1 | 62.5 | 81.6 | 49.2 | 42.8 | 64.6 | 59.6 |
| Two-stage | Ours | HRNet | KLD | 63.1 | 41.6 | 79.0 | 88.0 | 42.1 | 72.6 | 76.6 | 65.8 | 28.2 | 71.0 | 82.9 | 42.2 | 57.1 | 81.3 | 72.5 | 70.4 | 89.7 | 53.3 | 49.1 | 66.3 | 64.7 |
Table 3 Detection performance for small-object categories (AP, %) on the DIOR-R and DOTAv1.0 datasets

| Stage | Method | Backbone | Loss | SH (DIOR-R) | VE (DIOR-R) | WM (DIOR-R) | SV (DOTAv1.0) | SH (DOTAv1.0) |
|---|---|---|---|---|---|---|---|---|
| One-stage | Rotated RetinaNet | ResNet50 | Smooth L1 | 67.0 | 32.5 | 61.9 | 66.5 | 85.8 |
| One-stage | R3Det | ResNet50 | Smooth L1 | 68.3 | 35.7 | 54.0 | 66.9 | 87.2 |
| One-stage | S2ANet | ResNet50 | Smooth L1 | 77.7 | 42.2 | 63.1 | 64.9 | 79.1 |
| One-stage | SASM RepPoints | ResNet50 | GIoU | 79.1 | 41.7 | 64.2 | 59.9 | 78.0 |
| One-stage | Oriented RepPoints | ResNet50 | GIoU | 80.3 | 46.2 | 65.0 | 74.1 | 88.4 |
| Two-stage | Rotated Faster RCNN | ResNet50 | Smooth L1 | 80.4 | 41.3 | 64.3 | 63.7 | 79.4 |
| Two-stage | Oriented RCNN | ResNet50 | Smooth L1 | 81.0 | 42.3 | 65.0 | 62.3 | 88.8 |
| Two-stage | RoI Transformer | ResNet50 | Smooth L1 | 81.2 | 43.4 | 66.0 | 68.4 | 80.0 |
| Two-stage | ReDet | ReResNet50 | Smooth L1 | 81.1 | 42.8 | 64.6 | 65.8 | 87.4 |
| Two-stage | Ours | HRNet | KLD | 81.3 | 49.1 | 66.3 | 68.8 | 89.3 |
Table 4 Comparison of effectiveness of KLD and HRNet on DOTAv1.0 dataset

| Method | KLD | HRNet | mAP/% |
|---|---|---|---|
| Rotated Faster RCNN | | | 66.3 |
| RoI Transformer | | | 68.8 |
| Ours(a) | √ | | 70.3 |
| Ours(b) | | √ | 71.7 |
| Ours(c) | √ | √ | 72.5 |
Table 5 Comparison of effectiveness of KLD and HRNet on DIOR-R dataset

| Method | KLD | HRNet | mAP/% |
|---|---|---|---|
| Rotated Faster RCNN | | | 55.5 |
| RoI Transformer | | | 60.7 |
| Ours(a) | √ | | 61.5 |
| Ours(b) | | √ | 63.9 |
| Ours(c) | √ | √ | 64.7 |
Table 6 Comparison of mAP for three loss functions

| Loss function | DOTAv1.0 mAP/% | DIOR-R mAP/% |
|---|---|---|
| GWD | 69.2 | 61.4 |
| KFIoU | 68.9 | 60.3 |
| KLD | 70.3 | 61.5 |
-
[1] LIU L, OUYANG W, WANG X G, et al. Deep learning for generic object detection: a survey[J]. International Journal of Computer Vision,2020,128(2):261-318. doi: 10.1007/s11263-019-01247-4
[2] FU Changhong, CHEN Kunhui, LU Kunhan, et al. Aviation fastener rotation detection for intelligent optical perception with edge computing[J]. Journal of Applied Optics,2022,43(3):472-480. (in Chinese)
[3] DING J, XUE N, LONG Y, et al. Learning RoI transformer for oriented object detection in aerial images[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2019: 2849-2858.
[4] QIAN W, YANG X, PENG S L, et al. Learning modulated loss for rotated object detection[C]// Proceedings of the AAAI conference on artificial intelligence. [S. l. ]: [s. n. ], 2021: 2458-2466.
[5] MA J Q, SHAO W Y, YE H, et al. Arbitrary-oriented scene text detection via rotation proposals[J]. IEEE Transactions on Multimedia,2018,20(11):3111-3122. doi: 10.1109/TMM.2018.2818020
[6] XIE X X, CHENG G, WANG J B, et al. Oriented R-CNN for object detection[C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 3520-3529.
[7] HAN J M, DING J, XUE N, et al. ReDet: a rotation-equivariant detector for aerial object detection[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 2786-2795.
[8] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2016: 770-778.
[9] YANG X, YAN J C, MING Q, et al. Rethinking rotated object detection with Gaussian Wasserstein distance loss[C]// International Conference on Machine Learning. [S. l. ]: [s. n. ], 2021: 11830-11841.
[10] YU Y, DA F P. Phase-shifting coder: predicting accurate orientation in oriented object detection[EB/OL]. [2023-06-19].https://doi.org/10.48550/arXiv.2211.06368.
[11] YANG X, YAN J C, FENG Z M, et al. R3Det: refined single-stage detector with feature refinement for rotating object[C]// Proceedings of the AAAI Conference on Artificial Intelligence. [S. l. ]: [s. n. ], 2021: 3163-3171.
[12] HOU L, LU K, XUE J, et al. Shape-adaptive selection and measurement for oriented object detection[C]// Proceedings of the AAAI Conference on Artificial Intelligence. [S. l. ]: [s. n. ], 2022: 923-932.
[13] LI W, CHEN Y, HU K, et al. Oriented RepPoints for aerial object detection[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2022: 1829-1838.
[14] WU Liequan, ZHOU Zhifeng, ZHU Zhiling, et al. Surface defect detection of patch diode based on improved YOLO-V4[J]. Journal of Applied Optics,2023,44(3):621-627. doi: 10.5768/JAO202344.0303007 (in Chinese)
[15] YANG X, YANG X J, YANG J R, et al. Learning high-precision bounding box for rotated object detection via Kullback-Leibler divergence[C]// Advances in Neural Information Processing Systems. Cambridge: MIT Press, 2021: 18381-18394.
[16] YANG X, ZHOU Y, ZHANG G F, et al. The KFIoU loss for rotated object detection[EB/OL]. [2023-06-19]. https://doi.org/10.48550/arXiv.2201.12558.
[17] WANG K, LI Z, SU A, et al. Oriented object detection in optical remote sensing images: a survey[EB/OL]. [2023-06-19].https://doi.org/10.48550/arXiv.2302.10473.
[18] WANG J D, SUN K, CHENG T S, et al. Deep high-resolution representation learning for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence,2021,43(10):3349-3364. doi: 10.1109/TPAMI.2020.2983686
[19] CAO Jiale, LI Yali, SUN Hanqing, et al. A survey on deep learning based visual object detection[J]. Journal of Image and Graphics,2022,27(6):1697-1722. doi: 10.11834/jig.220069 (in Chinese)
[20] YANG X, YAN J C, LIAO W L, et al. SCRDet++: detecting small, cluttered and rotated objects via instance-level feature denoising and rotation loss smoothing[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence,2023,45(2):2384-2399. doi: 10.1109/TPAMI.2022.3166956
[21] HAN J, DING J, LI J, et al. Align deep features for oriented object detection[J]. IEEE Transactions on Geoscience and Remote Sensing,2022,60:1-11.
[22] XIA G S, BAI X, DING J, et al. DOTA: a large-scale dataset for object detection in aerial images[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2018: 3974-3983.
[23] CHENG G, WANG J B, LI K, et al. Anchor-free oriented proposal generator for object detection[J]. IEEE Transactions on Geoscience and Remote Sensing,2022,60:1-11.
[24] LI K, WAN G, CHENG G, et al. Object detection in optical remote sensing images: a survey and a new benchmark[J]. ISPRS Journal of Photogrammetry and Remote Sensing,2020,159:296-307. doi: 10.1016/j.isprsjprs.2019.11.023
[25] ZHOU Y, YANG X, ZHANG G F, et al. MMRotate: a rotated object detection benchmark using PyTorch[C]// Proceedings of the 30th ACM International Conference on Multimedia. [S. l. ]: [s. n. ], 2022: 7331-7334.
[26] WU Y X, KIRILLOV A, MASSA F, et al. Detectron2[EB/OL]. [2023-06-19]. https://github.com/facebookresearch/detectron2.
[27] WANG R S, DUAN Y F, LI Y K. Segmenting anything also detect anything[EB/OL]. [2023-06-19]. https://wwwww.easychair.org/publications/preprint_download/HVhP.
[28] LI J, GONG Y X, MA Z, et al. Enhancing feature fusion using attention for small object detection[C]// 2022 IEEE 8th International Conference on Computer and Communications. New York: IEEE, 2022: 1859-1863.
[29] YUAN Y, ZHANG Y L. OLCN: an optimized low coupling network for small objects detection[J]. IEEE Geoscience and Remote Sensing Letters,2021,19:1-5.
-