A real-time detection method for multi-scale ships in complex scenes

doi:10.11959/j.issn.1000-0801.2022258

Abstract

Ship detection plays an important role in tasks such as military reconnaissance, maritime target tracking, and maritime traffic control. However, because ships vary widely in shape and scale and the sea-surface background is complex, detecting multi-scale ships on complex sea surfaces remains a challenge. To address this problem, an improved YOLOv4 method based on multi-layer information interactive fusion and an attention mechanism is proposed. The method builds a bidirectional fine-grained feature pyramid mainly from a multi-layer information interactive fusion (MLIF) module and a multi-attention receptive field (MARF) module. The MLIF module fuses features of different scales: it not only concatenates high-level semantic features from deep layers but also reshapes the richer features from shallower layers. The MARF module consists of a receptive field block (RFB) and an attention mechanism module, and it effectively emphasizes important features while suppressing redundant ones. To further evaluate the proposed method, experiments were carried out on the Singapore maritime dataset (SMD). The results show that the proposed method can effectively address multi-scale ship detection in complex marine environments while meeting real-time requirements.

Key words: multi-scale ship detection, multi-layer information interactive fusion, multi-attention receptive field, bidirectional fine-grained feature pyramid
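
The abstract states that MARF emphasizes important features and suppresses redundant ones, but it does not give the module's internals. The following is only a toy, parameter-free illustration of attention-style gating (a channel gate followed by a spatial gate), not the authors' module. The function names, the use of plain averages, and the toy feature map are all assumptions made for illustration; learned modules such as CBAM [20] compute their gates with small trainable networks instead.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Scale each channel by a gate derived from its global average (assumed, unlearned gate)."""
    out = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(fmap):
    """Scale each location by a gate derived from the mean across channels (assumed, unlearned gate)."""
    n_ch = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    gates = [
        [sigmoid(sum(fmap[c][i][j] for c in range(n_ch)) / n_ch) for j in range(width)]
        for i in range(height)
    ]
    return [
        [[fmap[c][i][j] * gates[i][j] for j in range(width)] for i in range(height)]
        for c in range(n_ch)
    ]


# Hypothetical feature map: 2 channels of size 2x2.
# Channel 0 has a high average response, channel 1 a low one.
fmap = [
    [[4.0, 4.0], [4.0, 4.0]],
    [[-4.0, -4.0], [-4.0, -4.0]],
]
refined = spatial_attention(channel_attention(fmap))
```

In this toy case the channel with the high average response keeps most of its magnitude, while the channel with the low average response is scaled close to zero, which is the qualitative behavior the abstract attributes to the attention part of MARF.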

CLC number: TP391

Weina ZHOU, Lu LIU. A real-time detection method for multi-scale ships in complex scenes[J]. Telecommunications Science, 2022, 38(10): 67-78.

Table 2  Comparison of multi-scale ship detection accuracy for different modules

Configuration	mAP (large)	mAP (medium)	mAP (small)	FPS
FPN-PAN	0.675	0.375	0.192	27.2
MLIF-PAN	0.749	0.665	0.511	25.1
Proposed method (MLIF-MARF)	0.781	0.679	0.601	26.0

Table 3  Comparison with other object detection methods

Method	R	P	F1	mAP	FPS
Faster-RCNN [7]	0.741	0.851	0.786	0.775	7.3
RetinaNet [22]	0.696	0.784	0.731	0.703	22.9
SSD [11]	0.306	0.768	0.403	0.473	40.0
CenterNet [24]	0.673	0.794	0.712	0.688	52.3
YOLOx [23]	0.644	0.775	0.689	0.670	59.9
YOLOv3 [10]	0.512	0.780	0.593	0.579	10.4
YOLOv4 [21]	0.618	0.747	0.661	0.648	27.2
Proposed method	0.678	0.865	0.744	0.769	26.0
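
One way to read the accuracy/speed trade-off in Table 3 is to keep only the methods above a speed threshold and then rank them by mAP. The sketch below does this with the table's values. The 25 FPS cut-off (`MIN_FPS`) is an assumption chosen for illustration; the source does not state what frame rate it counts as real time.

```python
# (method, R, P, F1, mAP, FPS), values copied from Table 3
RESULTS = [
    ("Faster-RCNN", 0.741, 0.851, 0.786, 0.775, 7.3),
    ("RetinaNet", 0.696, 0.784, 0.731, 0.703, 22.9),
    ("SSD", 0.306, 0.768, 0.403, 0.473, 40.0),
    ("CenterNet", 0.673, 0.794, 0.712, 0.688, 52.3),
    ("YOLOx", 0.644, 0.775, 0.689, 0.670, 59.9),
    ("YOLOv3", 0.512, 0.780, 0.593, 0.579, 10.4),
    ("YOLOv4", 0.618, 0.747, 0.661, 0.648, 27.2),
    ("Proposed method", 0.678, 0.865, 0.744, 0.769, 26.0),
]

MIN_FPS = 25.0  # assumed real-time threshold, not taken from the source


def best_real_time(results, min_fps):
    """Return the row with the highest mAP among methods running at min_fps or faster."""
    fast_enough = [row for row in results if row[5] >= min_fps]
    return max(fast_enough, key=lambda row: row[4])


best_overall = max(RESULTS, key=lambda row: row[4])
best_fast = best_real_time(RESULTS, MIN_FPS)
print("highest mAP overall:", best_overall[0], best_overall[4], best_overall[5])
print("highest mAP at >= 25 FPS:", best_fast[0], best_fast[4], best_fast[5])
```

Under this assumed threshold, Faster-RCNN has the highest mAP overall (0.775) but runs at 7.3 FPS, while the proposed method has the highest mAP (0.769) among the methods that reach 25 FPS.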

References

[1]	HUANG J, JIANG Z G, ZHANG H P, et al. Region proposal for ship detection based on structured forests edge method[C]// Proceedings of 2017 IEEE International Geoscience and Remote Sensing Symposium. Piscataway: IEEE Press, 2017: 1856-1859.
[2]	ZHU Q Y, JIANG Y L, CHEN B. Design and implementation of video-based detection system for wharf ship[C]// Proceedings of IET International Conference on Smart and Sustainable City 2013 (ICSSC 2013). IET, 2013: 493-496.
[3]	LI S, ZHOU Z Q, WANG B, et al. A novel inshore ship detection via ship head classification and body boundary determination[J]. IEEE Geoscience and Remote Sensing Letters, 2016, 13(12): 1920-1924.
[4]	LIU L, WANG X G, CHEN J, et al. Deep learning for generic object detection: a survey[J]. International Journal of Computer Vision, 2020, 128(2): 261-318.
[5]	GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]// Proceedings of 27th IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2014: 580-587.
[6]	GIRSHICK R. Fast R-CNN[C]// Proceedings of 2015 IEEE International Conference on Computer Vision (ICCV). Piscataway: IEEE Press, 2015: 1440-1448.
[7]	REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[8]	REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE Press, 2016: 779-788.
[9]	REDMON J, FARHADI A. YOLO9000: better, faster, stronger[C]// Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 7263-7271.
[10]	REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB]. 2018: arXiv:1804.02767.
[11]	LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector[C]// Proceedings of European Conference on Computer Vision. Cham: Springer International Publishing, 2016: 21-37.
[12]	ZHANG J X, WANG H L. Ship target detection in SAR image based on improved YOLOv3[J]. Journal of Signal Processing, 2021, 37(9): 1623-1632.
[13]	PENG X L, ZHONG R F, LI Z, et al. Optical remote sensing image change detection based on attention mechanism and image difference[J]. IEEE Transactions on Geoscience and Remote Sensing, 2020, 59(9): 7296-7307.
[14]	ZHANG X H, YAO L B, LYU Y F, et al. Data-adaptive single-shot ship detector with a bidirectional feature fusion module for SAR images[J]. Journal of Image and Graphics, 2020, 25(9): 1943-1952.
[15]	SHAO Z F, WU W J, WANG Z Y, et al. SeaShips: a large-scale precisely annotated dataset for ship detection[J]. IEEE Transactions on Multimedia, 2018, 20(10): 2593-2604.
[16]	LI Y, GUO J, GUO X, et al. A novel target detection method of the unmanned surface vehicle under all-weather conditions with an improved YOLOv3[J]. Sensors, 2020, 20(17): 4885.
[17]	SHAO Z, WANG L, WANG Z, et al. Saliency-aware convolution neural network for ship detection in surveillance video[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2019, 30(3): 781-794.
[18]	WANG C Y, LIAO H, WU Y H, et al. CSPNet: a new backbone that can enhance learning capability of CNN[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE Press, 2020: 390-391.
[19]	LIU S, HUANG D. Receptive field block net for accurate and fast object detection[C]// Proceedings of the European Conference on Computer Vision (ECCV). 2018: 385-400.
[20]	WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the European Conference on Computer Vision (ECCV). 2018: 3-19.
[21]	BOCHKOVSKIY A, WANG C Y, LIAO H, et al. YOLOv4: optimal speed and accuracy of object detection[EB]. 2020: arXiv:2004.10934.
[22]	LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]// Proceedings of 2017 IEEE International Conference on Computer Vision (ICCV). Piscataway: IEEE Press, 2017: 2999-3007.
[23]	GE Z, LIU S, WANG F, et al. YOLOx: exceeding YOLO series in 2021[EB]. 2021.
[24]	DUAN K W, BAI S, XIE L X, et al. CenterNet: keypoint triplets for object detection[C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 6569-6578.
[25]	LIN T Y, DOLLAR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]// Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 2117-2125.
[26]	LIU S, QI L, QIN H F, et al. Path aggregation network for instance segmentation[C]// Proceedings of 2018 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 8759-8768.
[27]	KANG S, ZHANG J W, ZHU Z J, et al. An improved YOLOv4 algorithm for pedestrian detection in complex visual scenes[J]. Telecommunications Science, 2021, 37(8): 46-56.
[28]	HE K M, ZHANG X Y, REN S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916.
[29]	CHEN P Y, HSIEH J W, WANG C Y, et al. Recursive hybrid fusion pyramid network for real-time small object detection on embedded devices[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE Press, 2020: 402-403.
[30]	CAO C, WU J, ZENG X, et al. Research on airplane and ship detection of aerial remote sensing images based on convolutional neural network[J]. Sensors, 2020, 20(17): 4696.

FPN	PAN	MLIF	MARF (RFB)	MARF (SA, CA)	mAP	FPS
√	√	-	-	-	0.648	27.2
-	√	√	-	-	0.696	27.1
√	-	-	√	-	0.684	27.5
√	-	-	-	√	0.632	27.7
√	-	-	√	√	0.701	26.5
-	-	√	√	-	0.733	26.1
-	-	√	-	√	0.704	26.3
-	-	√	√	√	0.769	26.0
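
The contribution of each module combination in the table above can be read off as a difference from the FPN + PAN baseline in the first row. The sketch below computes those differences from the table's values. The row labels and the helper name `deltas_vs_baseline` are illustrative, and "SA,CA" is kept exactly as the column is labeled in the table.

```python
# (enabled modules, mAP, FPS), values copied from the ablation table above
ABLATION = [
    (("FPN", "PAN"), 0.648, 27.2),
    (("PAN", "MLIF"), 0.696, 27.1),
    (("FPN", "RFB"), 0.684, 27.5),
    (("FPN", "SA,CA"), 0.632, 27.7),
    (("FPN", "RFB", "SA,CA"), 0.701, 26.5),
    (("MLIF", "RFB"), 0.733, 26.1),
    (("MLIF", "SA,CA"), 0.704, 26.3),
    (("MLIF", "RFB", "SA,CA"), 0.769, 26.0),
]


def deltas_vs_baseline(rows):
    """Return (label, mAP change, FPS change) for each row relative to the first row."""
    _, base_map, base_fps = rows[0]
    return [
        ("+".join(modules), round(m_ap - base_map, 3), round(fps - base_fps, 1))
        for modules, m_ap, fps in rows
    ]


deltas = deltas_vs_baseline(ABLATION)
for label, d_map, d_fps in deltas:
    print(f"{label:<16} mAP {d_map:+.3f}  FPS {d_fps:+.1f}")
```

Read this way, the full combination (MLIF with both MARF parts) gains 0.121 mAP over the baseline at a cost of 1.2 FPS, whereas adding only SA, CA to FPN lowers mAP by 0.016.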