Telecommunications Science ›› 2022, Vol. 38 ›› Issue (10): 67-78. doi: 10.11959/j.issn.1000-0801.2022258
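Weina ZHOU, Lu LIU
Revised:
2022-08-15
Online:
2022-10-20
Published:
2022-10-01
About the author:
Weina ZHOU (1982- ), female, Ph.D., is an associate professor and master's supervisor at the College of Information Engineering, Shanghai Maritime University. Her main research interests include image processing, object detection algorithms, and ASIC design.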
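Abstract:
Ship detection plays an important role in tasks such as military reconnaissance, maritime target tracking, and maritime traffic control. However, because ship shapes and scales vary widely and sea-surface backgrounds are complex, detecting multi-scale ships on complex sea surfaces remains a challenge. To address this problem, an improved YOLOv4 method based on multi-layer information interactive fusion and an attention mechanism is proposed. The method builds a bidirectional fine-grained feature pyramid, mainly from a multi-layer information interactive fusion (MLIF) module and a multi-attention receptive field (MARF) module. The MLIF module fuses features of different scales: it concatenates the deep, high-level semantic features and also reshapes the rich features of the shallower layers. The MARF module consists of a receptive field block (RFB) and an attention mechanism module, and it effectively emphasizes important features while suppressing redundant ones. To further evaluate the proposed method, experiments were conducted on the Singapore maritime dataset (SMD). The results show that the proposed method effectively addresses multi-scale ship detection in complex marine environments while also meeting real-time requirements.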
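As a rough illustration of the two ideas the abstract names, multi-scale feature fusion and attention-based reweighting, the following is a minimal pure-Python sketch. It is not the authors' MLIF or MARF module: the function names (`upsample_nearest`, `fuse`, `channel_attention`), the nearest-neighbor upsampling, and the parameter-free sigmoid gate are all assumptions chosen to keep the example self-contained, whereas real attention modules learn their gating weights.

```python
import math

# A feature map is a list of channels; each channel is a 2-D list (H x W) of floats.

def upsample_nearest(fmap, factor):
    """Nearest-neighbor upsampling of every channel by an integer factor."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(factor)]       # repeat columns
            rows.extend([list(wide) for _ in range(factor)])     # repeat rows
        out.append(rows)
    return out

def fuse(shallow, deep):
    """Hypothetical fusion: upsample the deep map to the shallow
    resolution, then concatenate along the channel axis."""
    factor = len(shallow[0]) // len(deep[0])
    return shallow + upsample_nearest(deep, factor)

def channel_attention(fmap):
    """Assumed parameter-free gate: rescale each channel by the
    sigmoid of its global average response."""
    out = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        gate = 1.0 / (1.0 + math.exp(-mean))
        out.append([[v * gate for v in row] for row in channel])
    return out

# Toy inputs: one shallow 4x4 channel and two deep 2x2 channels.
shallow = [[[1.0, 2.0, 3.0, 4.0] for _ in range(4)]]
deep = [[[2.0, -2.0], [-2.0, 2.0]], [[-4.0, -4.0], [-4.0, -4.0]]]

fused = fuse(shallow, deep)            # 3 channels, each 4x4
attended = channel_attention(fused)    # same shape, channels rescaled
```

In this toy input the third fused channel, whose mean response is negative, is scaled by a gate near 0.02, while the first channel keeps over 90% of its magnitude. That is the "emphasize important, suppress redundant" behavior in its simplest form.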
Weina ZHOU, Lu LIU. A real-time detection method for multi-scale ships in complex scenes[J]. Telecommunications Science, 2022, 38(10): 67-78.
Table 3
Comparison with other object detection methods
Method | R | P | F1 | mAP | FPS |
Faster-RCNN | 0.741 | 0.851 | 0.786 | 0.775 | 7.3 |
RetinaNet | 0.784 | 0.731 | 0.703 | 22.9 | |
SSD | 0.306 | 0.768 | 0.403 | 0.473 | 40.0 |
CenterNet | 0.673 | 0.794 | 0.712 | 0.688 | 52.3 |
YOLOx | 0.644 | 0.775 | 0.689 | 0.670 | |
YOLOv3 | 0.512 | 0.780 | 0.593 | 0.579 | 10.4 |
YOLOv4 | 0.618 | 0.747 | 0.661 | 0.648 | 27.2 |
Proposed method | 0.678 | 26.0 |
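The F1 column of Table 3 can be sanity-checked against the R and P columns. The sketch below assumes F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The source does not state how its F1 values were aggregated (for example, averaged per class), so small differences from the table are expected. The function name `f1_score` is an illustrative choice, and only rows that list R, P, and F1 unambiguously are included.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (R, P, reported F1) copied from the rows of Table 3 that list all three values.
rows = {
    "Faster-RCNN": (0.741, 0.851, 0.786),
    "SSD": (0.306, 0.768, 0.403),
    "CenterNet": (0.673, 0.794, 0.712),
    "YOLOx": (0.644, 0.775, 0.689),
    "YOLOv3": (0.512, 0.780, 0.593),
    "YOLOv4": (0.618, 0.747, 0.661),
}

for name, (r, p, reported) in rows.items():
    computed = f1_score(p, r)
    print(f"{name}: computed F1 = {computed:.3f}, reported F1 = {reported:.3f}")
```

Under this assumption the computed values come out slightly higher than the reported ones in every row, by at most about 0.035. That pattern is consistent with the reported F1 being aggregated in some other way than a single harmonic mean of the listed R and P.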
[1] HUANG J, JIANG Z G, ZHANG H P, et al. Region proposal for ship detection based on structured forests edge method[C]// Proceedings of 2017 IEEE International Geoscience and Remote Sensing Symposium. Piscataway: IEEE Press, 2017: 1856-1859.
[2] ZHU Q Y, JIANG Y L, CHEN B. Design and implementation of video-based detection system for WHARF ship[C]// Proceedings of IET International Conference on Smart and Sustainable City 2013 (ICSSC 2013). IET, 2013: 493-496.
[3] LI S, ZHOU Z Q, WANG B, et al. A novel inshore ship detection via ship head classification and body boundary determination[J]. IEEE Geoscience and Remote Sensing Letters, 2016, 13(12): 1920-1924.
[4] LIU L, WANG X G, CHEN J, et al. Deep learning for generic object detection: a survey[J]. International Journal of Computer Vision, 2020, 128(2): 261-318.
[5] GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]// Proceedings of 27th IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2014: 580-587.
[6] GIRSHICK R. Fast R-CNN[C]// Proceedings of 2015 IEEE International Conference on Computer Vision (ICCV). Piscataway: IEEE Press, 2015: 1440-1448.
[7] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[8] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE Press, 2016: 779-788.
[9] REDMON J, FARHADI A. YOLO9000: better, faster, stronger[C]// Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 7263-7271.
[10] REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB]. 2018: arXiv.1804.02767.
[11] LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector[C]// Proceedings of European Conference on Computer Vision. Cham: Springer International Publishing, 2016: 21-37.
[12] ZHANG J X, WANG H L. Ship target detection in SAR image based on improved YOLOv3[J]. Journal of Signal Processing, 2021, 37(9): 1623-1632.
[13] PENG X L, ZHONG R F, LI Z, et al. Optical remote sensing image change detection based on attention mechanism and image difference[J]. IEEE Transactions on Geoscience and Remote Sensing, 2020, 59(9): 7296-7307.
[14] ZHANG X H, YAO L B, LYU Y F, et al. Data-adaptive single-shot ship detector with a bidirectional feature fusion module for SAR images[J]. Journal of Image and Graphics, 2020, 25(9): 1943-1952.
[15] SHAO Z F, WU W J, WANG Z Y, et al. SeaShips: a large-scale precisely annotated dataset for ship detection[J]. IEEE Transactions on Multimedia, 2018, 20(10): 2593-2604.
[16] LI Y, GUO J, GUO X, et al. A novel target detection method of the unmanned surface vehicle under all-weather conditions with an improved YOLOv3[J]. Sensors, 2020, 20(17): 4885.
[17] SHAO Z, WANG L, WANG Z, et al. Saliency-aware convolution neural network for ship detection in surveillance video[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2019, 30(3): 781-794.
[18] WANG C Y, LIAO H, WU Y H, et al. CSPNet: a new backbone that can enhance learning capability of CNN[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE Press, 2020: 390-391.
[19] LIU S, HUANG D. Receptive field block net for accurate and fast object detection[C]// Proceedings of the European Conference on Computer Vision (ECCV). 2018: 385-400.
[20] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the European Conference on Computer Vision (ECCV). 2018: 3-19.
[21] BOCHKOVSKIY A, WANG C Y, LIAO H, et al. YOLOv4: optimal speed and accuracy of object detection[EB]. 2020: arXiv.2004.10934.
[22] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]// Proceedings of 2017 IEEE International Conference on Computer Vision (ICCV). Piscataway: IEEE Press, 2017: 2999-3007.
[23] GE Z, LIU S, WANG F, et al. YOLOx: exceeding YOLO series in 2021[EB]. 2021.
[24] DUAN K W, BAI S, XIE L X, et al. CenterNet: keypoint triplets for object detection[C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 6569-6578.
[25] LIN T Y, DOLLAR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]// Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 2117-2125.
[26] LIU S, QI L, QIN H F, et al. Path aggregation network for instance segmentation[C]// Proceedings of 2018 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 8759-8768.
[27] KANG S, ZHANG J W, ZHU Z J, et al. An improved YOLOv4 algorithm for pedestrian detection in complex visual scenes[J]. Telecommunications Science, 2021, 37(8): 46-56.
[28] HE K M, ZHANG X Y, REN S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916.
[29] CHEN P Y, HSIEH J W, WANG C Y, et al. Recursive hybrid fusion pyramid network for real-time small object detection on embedded devices[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE Press, 2020: 402-403.
[30] CAO C, WU J, ZENG X, et al. Research on airplane and ship detection of aerial remote sensing images based on convolutional neural network[J]. Sensors, 2020, 20(17): 4696.