Multi-scale oriented object detection based on improved RoI Transformer in remote sensing images

LIU Minhao; WANG Kun; JIN Ruijiao; LU Tian; LI Zhang

doi:10.5768/JAO202344.0502001

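LIU Minhao^{1, 2},
WANG Kun^{1, 2},
JIN Ruijiao^{1, 2},
LU Tian^{2},
LI Zhang^{1, 2}

1. College of Aerospace Science and Engineering, National University of Defense Technology, Changsha 410000, China
2. Hunan Province Key Laboratory of Image Measurement and Vision Navigation, National University of Defense Technology, Changsha 410000, China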

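Funding: National Natural Science Foundation of China (61801491)

About the author:
LIU Minhao (b. 1999), female, master's student. Research interests: deep learning and image processing, oriented object detection. E-mail: lmh313@nudt.edu.com

Corresponding author:
LI Zhang (b. 1985), male, Ph.D., research professor. Research interests: image measurement and vision navigation in aerospace, the fundamental theory and engineering applications of computer vision, and their extension to interdisciplinary medical-engineering applications. E-mail: lizhang08@nudt.edu.cn

CLC number: TN26; TP391.4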
Publication history
- Received: 2023-07-06
- Revised: 2023-08-15
- Published online: 2023-08-24
- Issue date: 2023-09-14

Abstract:
Oriented object detection is a crucial task in remote sensing image processing, and the large scale variations and arbitrary orientations of objects make automatic detection challenging. To address these problems, an improved RoI Transformer detection framework is proposed. First, the RoI Transformer framework is used to obtain rotated region of interest (RRoI) features for robust geometric feature extraction. Second, a high-resolution network (HRNet) is introduced into the detector to extract multi-resolution feature maps, which preserves high-resolution features while adapting to multi-scale variations of objects. Finally, a Kullback-Leibler divergence (KLD) loss is introduced to resolve the angle periodicity problem of the oriented bounding box representation, which improves adaptability to objects in arbitrary orientations; localization accuracy is further improved through joint optimization of the oriented bounding box parameters. The proposed method, called HRD-ROI Transformer (HRNet + KLD ROI Transformer), was compared with typical oriented object detection methods on two public datasets, DOTAv1.0 and DIOR-R. Compared with the original RoI Transformer framework, the mean average precision (mAP) improves by 3.7 and 4.0 percentage points on DOTAv1.0 and DIOR-R (from 68.8% to 72.5% and from 60.7% to 64.7%), respectively.
Keywords: oriented object detection; RoI Transformer; high-resolution network; object detection in remote sensing images
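The role of the KLD loss described in the abstract can be illustrated with a minimal sketch. It assumes the Gaussian box modeling of the KLD work cited as [15]: a rotated box (cx, cy, w, h, θ) is mapped to a 2-D Gaussian with mean (cx, cy) and covariance R·diag(w²/4, h²/4)·Rᵀ, and the divergence between the predicted and target Gaussians is computed in closed form. The function names `box_to_gaussian` and `kl_divergence` are hypothetical, this is not the authors' implementation, and the nonlinear mapping from divergence to the final loss value is omitted.

```python
import math

def box_to_gaussian(cx, cy, w, h, theta):
    """Map a rotated box (angle in radians) to a 2-D Gaussian.

    Returns the mean and the three distinct entries of the symmetric
    covariance matrix R * diag(w^2/4, h^2/4) * R^T.
    """
    c, s = math.cos(theta), math.sin(theta)
    a, b = w * w / 4.0, h * h / 4.0
    sxx = a * c * c + b * s * s
    sxy = (a - b) * c * s
    syy = a * s * s + b * c * c
    return (cx, cy), (sxx, sxy, syy)

def kl_divergence(pred, target):
    """Closed-form KL(N_pred || N_target) for 2-D Gaussians."""
    (mpx, mpy), (pxx, pxy, pyy) = pred
    (mtx, mty), (txx, txy, tyy) = target
    det_p = pxx * pyy - pxy * pxy
    det_t = txx * tyy - txy * txy
    # Inverse of the 2x2 target covariance.
    ixx, ixy, iyy = tyy / det_t, -txy / det_t, txx / det_t
    dx, dy = mpx - mtx, mpy - mty
    mahalanobis = dx * dx * ixx + 2.0 * dx * dy * ixy + dy * dy * iyy
    trace = ixx * pxx + 2.0 * ixy * pxy + iyy * pyy
    return 0.5 * (mahalanobis + trace + math.log(det_t / det_p) - 2.0)

# The same physical box written in two equivalent parameterizations:
# (w=4, h=2, theta=0) and (w=2, h=4, theta=90 degrees).
box_a = box_to_gaussian(0.0, 0.0, 4.0, 2.0, 0.0)
box_b = box_to_gaussian(0.0, 0.0, 2.0, 4.0, math.pi / 2)
same_box_kld = kl_divergence(box_a, box_b)

# A prediction shifted by one unit along x relative to the target.
shifted_kld = kl_divergence(box_to_gaussian(1.0, 0.0, 4.0, 2.0, 0.0), box_a)
print(same_box_kld, shifted_kld)
```

Because both parameterizations of the same box map to the same Gaussian, the divergence between them is numerically zero, whereas a direct regression on the raw (w, h, θ) values would see a large error at the angle boundary. The center, size and angle all enter one expression, so they are optimized jointly rather than as independent regression targets.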

Figure 1. Comparison between remote sensing images (first row) and natural images (second row)

Figure 2. Structure diagram of HRD-ROI Transformer

Figure 3. Structure diagram of HRNet [18]

Figure 4. Schematic diagram of angle boundary discontinuity

Figure 5. Schematic diagram of the square-like problem

Figure 6. Comparison of detection results (false detections)

Figure 7. Comparison of detection results (missed detections)

Figure 8. Comparison of detection results (objects with large aspect ratios)

Figure 9. Effectiveness of KLD on the DIOR-R dataset

Figure 10. Detection results for airports

Figure 11. Detection results for golf courses

Table 1 Performance comparison of different methods on the DOTAv1.0 dataset (per-class AP and mAP, %)

Method	Backbone	Loss	PL	BD	BR	GTF	SV	LV	SH	TC	BC	ST	SBF	RA	HA	SP	HC	mAP
One-stage
Rotated RetinaNet	ResNet50	Smooth L1	89.7	75.0	40.8	64.1	66.5	67.7	85.8	90.7	62.6	65.7	54.4	62.0	62.6	52.2	54.5	66.3
R3Det	ResNet50	Smooth L1	89.5	73.2	44.4	65.3	66.9	77.2	87.2	90.8	57.9	66.2	51.3	63.2	72.1	53.0	54.6	67.5
S2ANet	ResNet50	Smooth L1	89.0	73.8	43.6	67.1	64.9	74.2	79.1	90.5	62.7	66.3	56.8	64.8	61.2	54.2	42.0	66.0
SASM RepPoints	ResNet50	GIoU	89.5	76.0	45.3	70.7	59.9	74.6	78.0	90.3	64.1	67.3	46.2	67.1	70.3	56.3	44.3	66.7
Oriented RepPoints	ResNet50	GIoU	89.7	75.7	49.8	70.7	74.1	80.5	88.4	90.5	65.1	68.6	47.1	64.6	70.4	57.8	54.6	69.8
Two-stage
Rotated Faster RCNN	ResNet50	Smooth L1	88.5	74.7	44.1	70.0	63.7	71.4	79.4	90.5	58.7	62.0	54.7	64.5	63.2	58.2	50.1	66.3
Oriented RCNN	ResNet50	Smooth L1	89.1	75.8	50.0	68.3	62.3	84.0	88.8	90.6	68.7	62.3	57.0	63.6	66.4	57.3	39.1	68.2
RoI Transformer	ResNet50	Smooth L1	89.4	77.7	46.8	71.9	68.4	77.9	80.0	90.7	71.3	62.5	59.1	63.6	67.3	60.2	45.4	68.8
ReDet	ReResNet50	Smooth L1	89.6	78.0	47.4	68.8	65.8	82.4	87.4	90.6	67.5	69.7	63.4	65.9	67.3	53.0	48.7	69.7
Ours	HRNet	KLD	89.8	75.4	54.7	78.9	68.8	78.6	89.3	90.7	75.7	62.8	67.0	67.2	75.3	60.7	52.1	72.5

Class abbreviations: PL, plane; BD, baseball diamond; BR, bridge; GTF, ground track field; SV, small vehicle; LV, large vehicle; SH, ship; TC, tennis court; BC, basketball court; ST, storage tank; SBF, soccer-ball field; RA, roundabout; HA, harbor; SP, swimming pool; HC, helicopter.
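The mAP column of Table 1 is the unweighted mean of the 15 per-class AP values. The short sketch below recomputes it for the RoI Transformer baseline and for the proposed method, using only numbers copied from Table 1; the names `AP_DOTA` and `mean_ap` are illustrative. Because the per-class values in the table are already rounded to one decimal, a recomputed mean can differ from a reported mAP by about 0.1 in other rows.

```python
# Per-class AP (%) copied from Table 1, in the order
# PL, BD, BR, GTF, SV, LV, SH, TC, BC, ST, SBF, RA, HA, SP, HC.
AP_DOTA = {
    "RoI Transformer": [89.4, 77.7, 46.8, 71.9, 68.4, 77.9, 80.0, 90.7,
                        71.3, 62.5, 59.1, 63.6, 67.3, 60.2, 45.4],
    "Ours": [89.8, 75.4, 54.7, 78.9, 68.8, 78.6, 89.3, 90.7,
             75.7, 62.8, 67.0, 67.2, 75.3, 60.7, 52.1],
}

def mean_ap(per_class_ap):
    """Unweighted mean of per-class average precision values."""
    return sum(per_class_ap) / len(per_class_ap)

baseline_map = mean_ap(AP_DOTA["RoI Transformer"])
ours_map = mean_ap(AP_DOTA["Ours"])
gain = ours_map - baseline_map

print(f"RoI Transformer mAP: {baseline_map:.1f}")
print(f"Ours mAP:            {ours_map:.1f}")
print(f"Gain (points):       {gain:.1f}")
```

The recomputed values match the table (68.8 and 72.5), and their difference is the 3.7-point gain quoted in the abstract.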

Table 2 Performance comparison of different methods on the DIOR-R dataset (per-class AP and mAP, %)

Method	Backbone	Loss	APL	APO	BF	BC	BR	CH	ESA	ETS	DAM	GF	GTF	HA	OP	SH	STA	STO	TC	TS	VE	WM	mAP
One-stage
Rotated RetinaNet	ResNet50	Smooth L1	59.1	15.0	70.4	81.1	14.5	72.6	64.9	46.6	14.6	70.9	74.7	24.8	30.2	67.0	69.1	50.1	81.2	41.6	32.5	61.9	52.1
Rotated RetinaNet-G	ResNet50	GWD	64.6	21.1	72.9	81.1	13.1	72.7	68.5	45.8	14.7	70.1	75.1	27.2	30.6	68.9	66.1	57.9	81.2	47.4	34.8	61.5	53.8
R3Det	ResNet50	Smooth L1	53.3	27.9	68.9	81.0	22.9	72.6	66.4	49.6	19.2	68.4	76.0	22.1	41.5	68.3	57.9	55.4	81.1	45.5	35.7	54.0	53.4
R3Det-K	ResNet50	KLD	57.8	34.9	69.4	81.2	28.5	72.7	71.8	53.2	16.1	71.8	77.1	36.4	47.6	74.5	62.5	60.8	81.3	50.0	39.8	56.2	57.2
S2ANet	ResNet50	KFIoU	67.2	28.0	76.0	80.8	27.3	72.6	61.2	60.3	17.9	68.6	78.2	26.2	44.6	77.7	65.8	67.4	81.3	48.9	42.2	63.1	57.8
SASM RepPoints	ResNet50	GIoU	61.2	52.1	74.5	82.7	32.4	72.5	76.0	58.1	34.9	71.3	77.1	38.6	51.5	79.1	64.8	66.3	80.7	60.5	41.7	64.2	62.0
Oriented RepPoints	ResNet50	GIoU	68.7	41.9	75.1	84.0	35.4	75.4	79.5	65.8	32.1	75.0	78.6	43.4	51.8	80.3	66.5	66.4	85.4	54.0	46.2	65.0	63.5
Two-stage
Rotated Faster RCNN	ResNet50	Smooth L1	62.0	18.1	71.3	81.0	22.9	72.5	61.0	58.5	10.0	67.6	78.8	34.3	38.9	80.4	58.8	62.4	81.3	44.7	41.3	64.3	55.5
Oriented RCNN	ResNet50	Smooth L1	61.8	26.7	71.6	81.3	33.8	72.6	74.0	58.4	23.7	66.8	80.0	29.9	52.0	81.0	62.5	62.4	81.4	50.6	42.3	65.0	58.9
RoI Transformer	ResNet50	Smooth L1	63.1	30.7	71.8	81.5	33.9	72.7	75.8	64.6	24.3	67.4	82.5	35.7	51.1	81.2	70.5	70.8	81.5	44.4	43.4	66.0	60.7
ReDet	ReResNet50	Smooth L1	71.0	28.3	71.5	88.7	31.3	72.7	71.6	61.1	20.8	61.8	81.9	36.7	48.8	81.1	63.1	62.5	81.6	49.2	42.8	64.6	59.6
Ours	HRNet	KLD	63.1	41.6	79.0	88.0	42.1	72.6	76.6	65.8	28.2	71.0	82.9	42.2	57.1	81.3	72.5	70.4	89.7	53.3	49.1	66.3	64.7

Class abbreviations: APL, airplane; APO, airport; BF, baseball field; BC, basketball court; BR, bridge; CH, chimney; ESA, expressway service area; ETS, expressway toll station; DAM, dam; GF, golf field; GTF, ground track field; HA, harbor; OP, overpass; SH, ship; STA, stadium; STO, storage tank; TC, tennis court; TS, train station; VE, vehicle; WM, windmill.

Table 3 Small-object detection results on the DOTAv1.0 and DIOR-R datasets (AP, %)

Method	Backbone	Loss	DIOR-R SH	DIOR-R VE	DIOR-R WM	DOTAv1.0 SV	DOTAv1.0 SH
One-stage
Rotated RetinaNet	ResNet50	Smooth L1	67.0	32.5	61.9	66.5	85.8
R3Det	ResNet50	Smooth L1	68.3	35.7	54.0	66.9	87.2
S2ANet	ResNet50	Smooth L1	77.7	42.2	63.1	64.9	79.1
SASM RepPoints	ResNet50	GIoU	79.1	41.7	64.2	59.9	78.0
Oriented RepPoints	ResNet50	GIoU	80.3	46.2	65.0	74.1	88.4
Two-stage
Rotated Faster RCNN	ResNet50	Smooth L1	80.4	41.3	64.3	63.7	79.4
Oriented RCNN	ResNet50	Smooth L1	81.0	42.3	65.0	62.3	88.8
RoI Transformer	ResNet50	Smooth L1	81.2	43.4	66.0	68.4	80.0
ReDet	ReResNet50	Smooth L1	81.1	42.8	64.6	65.8	87.4
Ours	HRNet	KLD	81.3	49.1	66.3	68.8	89.3

Table 4 Comparison of the effectiveness of KLD and HRNet on the DOTAv1.0 dataset (√ = component used, - = not used)

Method	KLD	HRNet	mAP/%
Rotated Faster RCNN	-	-	66.3
RoI Transformer	-	-	68.8
Ours(a)	√	-	70.3
Ours(b)	-	√	71.7
Ours(c)	√	√	72.5

Table 5 Comparison of the effectiveness of KLD and HRNet on the DIOR-R dataset (√ = component used, - = not used)

Method	KLD	HRNet	mAP/%
Rotated Faster RCNN	-	-	55.5
RoI Transformer	-	-	60.7
Ours(a)	√	-	61.5
Ours(b)	-	√	63.9
Ours(c)	√	√	64.7
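The ablation results in Tables 4 and 5 can be read as gains over the RoI Transformer baseline. The sketch below computes those gains from the mAP values in the two tables; the dictionary layout and the name `gains_over_baseline` are illustrative choices, not part of the original work.

```python
# mAP (%) copied from the ablation tables (Tables 4 and 5).
ABLATION = {
    "DOTAv1.0": {"baseline": 68.8, "KLD": 70.3, "HRNet": 71.7, "KLD+HRNet": 72.5},
    "DIOR-R": {"baseline": 60.7, "KLD": 61.5, "HRNet": 63.9, "KLD+HRNet": 64.7},
}

def gains_over_baseline(row):
    """Return the mAP gain (percentage points) of each variant over the baseline."""
    return {
        name: round(value - row["baseline"], 1)
        for name, value in row.items()
        if name != "baseline"
    }

GAINS = {dataset: gains_over_baseline(row) for dataset, row in ABLATION.items()}

for dataset, gains in GAINS.items():
    print(dataset, gains)
```

On both datasets the HRNet backbone contributes the larger share of the improvement (2.9 and 3.2 points) compared with the KLD loss alone (1.5 and 0.8 points). On DOTAv1.0 the combined gain of 3.7 points is smaller than the sum of the individual gains (4.4), while on DIOR-R the combined gain of 4.0 points equals their sum.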

Table 6 Comparison of mAP for models trained with three loss functions

Loss function	DOTAv1.0 mAP/%	DIOR-R mAP/%
GWD	69.2	61.4
KFIoU	68.9	60.3
KLD	70.3	61.5

References (29)

[1]	LIU L, OUYANG W, WANG X G, et al. Deep learning for generic object detection: a survey[J]. International Journal of Computer Vision,2020,128(2):261-318. doi: 10.1007/s11263-019-01247-4
[2]	FU Changhong, CHEN Kunhui, LU Kunhan, et al. Aviation fastener rotation detection for intelligent optical perception with edge computing[J]. Journal of Applied Optics,2022,43(3):472-480.
[3]	DING J, XUE N, LONG Y, et al. Learning RoI Transformer for oriented object detection in aerial images[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2019: 2849-2858.
[4]	QIAN W, YANG X, PENG S L, et al. Learning modulated loss for rotated object detection[C]// Proceedings of the AAAI Conference on Artificial Intelligence. [S. l. ]: [s. n. ], 2021: 2458-2466.
[5]	MA J Q, SHAO W Y, YE H, et al. Arbitrary-oriented scene text detection via rotation proposals[J]. IEEE Transactions on Multimedia,2018,20(11):3111-3122. doi: 10.1109/TMM.2018.2818020
[6]	XIE X X, CHENG G, WANG J B, et al. Oriented R-CNN for object detection[C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 3520-3529.
[7]	HAN J M, DING J, XUE N, et al. ReDet: a rotation-equivariant detector for aerial object detection[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 2786-2795.
[8]	HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2016: 770-778.
[9]	YANG X, YAN J C, MING Q, et al. Rethinking rotated object detection with Gaussian Wasserstein distance loss[C]// International Conference on Machine Learning. [S. l. ]: [s. n. ], 2021: 11830-11841.
[10]	YU Y, DA F P. Phase-shifting coder: predicting accurate orientation in oriented object detection[EB/OL]. [2023-06-19]. https://doi.org/10.48550/arXiv.2211.06368.
[11]	YANG X, YAN J C, FENG Z M, et al. R3Det: refined single-stage detector with feature refinement for rotating object[C]// Proceedings of the AAAI Conference on Artificial Intelligence. [S. l. ]: [s. n. ], 2021: 3163-3171.
[12]	HOU L, LU K, XUE J, et al. Shape-adaptive selection and measurement for oriented object detection[C]// Proceedings of the AAAI Conference on Artificial Intelligence. [S. l. ]: [s. n. ], 2022: 923-932.
[13]	LI W, CHEN Y, HU K, et al. Oriented RepPoints for aerial object detection[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2022: 1829-1838.
[14]	WU Liequan, ZHOU Zhifeng, ZHU Zhiling, et al. Surface defect detection of patch diode based on improved YOLO-V4[J]. Journal of Applied Optics,2023,44(3):621-627. doi: 10.5768/JAO202344.0303007
[15]	YANG X, YANG X J, YANG J R, et al. Learning high-precision bounding box for rotated object detection via Kullback-Leibler divergence[C]// Advances in Neural Information Processing Systems. Cambridge: MIT Press, 2021: 18381-18394.
[16]	YANG X, ZHOU Y, ZHANG G F, et al. The KFIoU loss for rotated object detection[EB/OL]. [2023-06-19]. https://doi.org/10.48550/arXiv.2201.12558.
[17]	WANG K, LI Z, SU A, et al. Oriented object detection in optical remote sensing images: a survey[EB/OL]. [2023-06-19]. https://doi.org/10.48550/arXiv.2302.10473.
[18]	WANG J D, SUN K, CHENG T S, et al. Deep high-resolution representation learning for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence,2021,43(10):3349-3364. doi: 10.1109/TPAMI.2020.2983686
[19]	CAO Jiale, LI Yali, SUN Hanqing, et al. A survey on deep learning based visual object detection[J]. Journal of Image and Graphics,2022,27(6):1697-1722. doi: 10.11834/jig.220069
[20]	YANG X, YAN J C, LIAO W L, et al. SCRDet++: detecting small, cluttered and rotated objects via instance-level feature denoising and rotation loss smoothing[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence,2023,45(2):2384-2399. doi: 10.1109/TPAMI.2022.3166956
[21]	HAN J, DING J, LI J, et al. Align deep features for oriented object detection[J]. IEEE Transactions on Geoscience and Remote Sensing,2022,60:1-11.
[22]	XIA G S, BAI X, DING J, et al. DOTA: a large-scale dataset for object detection in aerial images[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2018: 3974-3983.
[23]	CHENG G, WANG J B, LI K, et al. Anchor-free oriented proposal generator for object detection[J]. IEEE Transactions on Geoscience and Remote Sensing,2022,60:1-11.
[24]	LI K, WAN G, CHENG G, et al. Object detection in optical remote sensing images: a survey and a new benchmark[J]. ISPRS Journal of Photogrammetry and Remote Sensing,2020,159:296-307. doi: 10.1016/j.isprsjprs.2019.11.023
[25]	ZHOU Y, YANG X, ZHANG G F, et al. MMRotate: a rotated object detection benchmark using PyTorch[C]// Proceedings of the 30th ACM International Conference on Multimedia. [S. l. ]: [s. n. ], 2022: 7331-7334.
[26]	WU Y X, KIRILLOV A, MASSA F, et al. Detectron2. [2023-06-19]. https://github.com/facebookresearch/detectron2.
[27]	WANG R S, DUAN Y F, LI Y K. Segmenting anything also detect anything. [2023-06-19]. https://wwwww.easychair.org/publications/preprint_download/HVhP.
[28]	LI J, GONG Y X, MA Z, et al. Enhancing feature fusion using attention for small object detection[C]// 2022 IEEE 8th International Conference on Computer and Communications. New York: IEEE, 2022: 1859-1863.
[29]	YUAN Y, ZHANG Y L. OLCN: an optimized low coupling network for small objects detection[J]. IEEE Geoscience and Remote Sensing Letters,2021,19:1-5.
