Chinese Journal of Network and Information Security, 2018, Vol. 4, Issue (5): 21-31. doi: 10.11959/j.issn.2096-109x.2018043
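Yuchao JIANG, Lixin JI, Chao GAO, Shaomei LI
Revised: 2018-05-02
Online: 2018-05-01
Published: 2018-08-04
About the authors:
Yuchao JIANG (1994-), born in Yancheng, Jiangsu, is a master's student at Information Engineering University. Main research interest: computer vision.
Lixin JI (1969-), born in Huai'an, Jiangsu, Ph.D., is a research fellow at Information Engineering University. Main research interest: communication and information systems.
Chao GAO (1982-), born in Zhengzhou, Henan, Ph.D., is an assistant research fellow at Information Engineering University. Main research interest: computer vision.
Shaomei LI (1982-), born in Zhongxiang, Hubei, Ph.D., is an associate research fellow at Information Engineering University. Main research interest: computer vision.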
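Abstract:
To address the scarcity of trainable samples for logo recognition under deep learning frameworks, a context-based logo data synthesis algorithm is proposed. The algorithm combines several types of context information to guide the synthesis of logo images: context within the logo object, context in the neighborhood surrounding the logo, context between the logo and other objects, and context of the scene in which the logo appears. Experimental results on the FlickrLogos-32 dataset show that the proposed algorithm improves the performance of the logo recognition algorithm without relying on additional manual annotation (mAP increases by 8.5%), which verifies the effectiveness of the synthesis algorithm.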
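As a rough illustration of the basic compositing step behind logo image synthesis, the sketch below alpha-blends a transparent logo template onto a background and records the resulting bounding box, so the annotation comes for free. It is a minimal sketch and not the authors' algorithm: the function names (`blend_pixel`, `paste_logo`, `synthesize`), the nested-list image representation, and the uniformly random placement are all assumptions made for illustration, and none of the context cues described in the abstract are modelled.

```python
import random


def blend_pixel(fg_rgb, alpha, bg_rgb):
    """Alpha-blend one foreground pixel over one background pixel (alpha in [0, 1])."""
    return tuple(round(alpha * f + (1.0 - alpha) * b) for f, b in zip(fg_rgb, bg_rgb))


def paste_logo(background, logo_rgba, top, left):
    """Composite an RGBA logo onto an RGB background; return (image, bbox).

    Images are row-major nested lists of pixel tuples (a toy representation
    chosen only to keep the example dependency-free).
    """
    out = [list(row) for row in background]  # copy rows so the input is untouched
    for dy, logo_row in enumerate(logo_rgba):
        for dx, (r, g, b, a) in enumerate(logo_row):
            y, x = top + dy, left + dx
            out[y][x] = blend_pixel((r, g, b), a / 255.0, out[y][x])
    height, width = len(logo_rgba), len(logo_rgba[0])
    bbox = (left, top, left + width, top + height)  # x_min, y_min, x_max, y_max
    return out, bbox


def synthesize(background, logo_rgba, rng):
    """Place the logo at a uniformly random position that keeps it inside the image.

    Random placement is a hypothetical stand-in; it ignores scene context.
    """
    max_top = len(background) - len(logo_rgba)
    max_left = len(background[0]) - len(logo_rgba[0])
    top = rng.randint(0, max_top)
    left = rng.randint(0, max_left)
    return paste_logo(background, logo_rgba, top, left)


# Usage: a 4x6 black background and a 1x2 logo whose second pixel is transparent.
background = [[(0, 0, 0)] * 6 for _ in range(4)]
logo = [[(200, 100, 50, 255), (200, 100, 50, 0)]]
image, bbox = synthesize(background, logo, random.Random(0))
print(bbox)
```

The returned bounding box can serve directly as the detection label for the synthetic image, which is what makes synthesis attractive when real annotations are scarce.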
Yuchao JIANG, Lixin JI, Chao GAO, Shaomei LI. Research on synthesis data generation method for logo recognition[J]. Chinese Journal of Network and Information Security, 2018, 4(5): 21-31.
Table 1 Existing logo recognition datasets
Dataset | Logo classes | Total objects | Total images | Publicly available |
BelgaLogos [7] | 37 | 2,695 | 1,951 | Yes |
FlickrLogos-27 [8] | 27 | 4,671 | 1,080 | Yes |
FlickrLogos-32 [9] | 32 | 3,404 (Note 1) | 2,240 | Yes |
LOGO-NET [10] | 160 | 130,608 | 73,414 | No |
Logos-32plus [11] | 32 | 12,302 | 7,830 | Yes |
TopLogo10 [12] | 10 | 863 | 700 | Yes |
Note 1: We contacted the authors of [12], who confirmed that the object count for the FlickrLogos-32 dataset reported in [12] is incorrect; the correct count is 3,404.
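To make the sample scarcity concrete, the following sketch computes the average number of annotated objects per class and per image from the dataset table above. The dictionary layout and the helper name `density` are illustrative assumptions; the counts are copied from the table.

```python
# (logo classes, total objects, total images), copied from the dataset table above.
datasets = {
    "BelgaLogos": (37, 2695, 1951),
    "FlickrLogos-27": (27, 4671, 1080),
    "FlickrLogos-32": (32, 3404, 2240),
    "LOGO-NET": (160, 130608, 73414),
    "Logos-32plus": (32, 12302, 7830),
    "TopLogo10": (10, 863, 700),
}


def density(classes, objects, images):
    """Return (objects per class, objects per image) for one dataset."""
    return objects / classes, objects / images


for name, stats in datasets.items():
    per_class, per_image = density(*stats)
    print(f"{name}: {per_class:.1f} objects/class, {per_image:.2f} objects/image")
```

On these figures most public datasets offer on the order of a hundred annotated objects per class, which is small for training a deep detector.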
Table 2 Experimental results of the proposed synthesis algorithm and comparison with [12] (AP values)
Method | Training/test split (images per class) | Adidas | Aldi | Apple | Becks | BMW | Carls | Chim | Coke | mAP |
| | Corona | DHL | Erdi | Esso | Fedex | Ferra | Ford | Fost | |
| | Google | Guin | Hein | HP | Milka | Nvid | Paul | Pepsi | |
| | Ritt | Shell | Sing | Starb | Stel | Texa | Tsin | Ups | |
RealImg ([12]) | Train: 10 real; test: 60 real | 23.7 | 57.5 | 63.0 | 69.6 | 63.7 | 50.6 | 55.2 | 26.8 | 50.4% |
| | 79.0 | 25.8 | 61.2 | 44.2 | 45.9 | 80.6 | 64.3 | 43.2 | |
| | 47.7 | 58.2 | 61.8 | 21.3 | 19.4 | 17.4 | 48.2 | 17.8 | |
| | 34.8 | 45.8 | 71.8 | 70.2 | 79.6 | 56.7 | 56.9 | 52.2 | |
SynImg-32Cls ([12]) | Train: 100 synthetic; test: 60 real | 9.4 | 47.3 | 9.6 | 70.3 | 39.9 | 28.3 | 15.8 | 21.7 | 27.6% |
| | 6.1 | 11.1 | 4.1 | 44.7 | 22.9 | 60.9 | 43.6 | 28.8 | |
| | 23.0 | 16.7 | 43.1 | 9.9 | 4.6 | 1.1 | 39.1 | 9.7 | |
| | 22.7 | 38.3 | 15.5 | 65.6 | 28.7 | 55.1 | 27.4 | 20.1 | |
SynImg-32Cls + RealImg ([12]) | Train: 100 synthetic; fine-tune: 10 real; test: 60 real | 26.8 | 63.7 | 65.8 | 72.7 | | | | | 54.8% |
| | 76.0 | 31.5 | 63.0 | 52.2 | 46.6 | | | | |
| | 58.0 | 52.6 | 65.2 | 23.2 | 24.0 | 12.5 | 54.1 | | |
| | 37.9 | 75.0 | 79.0 | 64.2 | 57.4 | 54.4 | | | |
RealImg (Ours) | Train: 10 real; test: 60 real | 24.4 | 57.2 | 66.6 | 72.0 | 70.8 | 42.8 | 55.3 | 24.8 | 50.5% |
| | 82.8 | 29.5 | 62.5 | 44.1 | 42.7 | 87.2 | 59.3 | 39.9 | |
| | 51.2 | 54.6 | 65.1 | 24.2 | 15.5 | 16.9 | 52.3 | 17.4 | |
| | 32.3 | 44.7 | 72.7 | 69.7 | 77.3 | 62.5 | 52.6 | 44.3 | |
SynImg-32Cls (Ours) | Train: 100 synthetic; test: 60 real | 54.0 | 34.6 | 44.1 | 17.6 | 18.4 | 33.9 | 20.6 | | 32.6% |
| | 8.2 | 21.5 | 12.1 | 49.6 | 16.6 | 28.2 | 35.4 | 50.3 | |
| | 46.4 | 40.0 | 33.1 | 26.1 | 9.9 | 18.8 | 64.5 | 22.3 | |
| | 39.4 | 45.5 | 15.8 | 55.5 | 17.6 | 47.5 | 38.7 | 39.9 | |
SynImg-32Cls + RealImg (Ours) | Train: 100 synthetic; fine-tune: 10 real; test: 60 real | 31.0 | 76.6 | 51.1 | 62.1 | 29.8 | | | | |
| | 49.0 | 87.9 | 76.9 | | | | | | |
| | 22.8 | | | | | | | | |
| | 45.5 | 72.5 | | | | | | | |
SynImg-32Cls + RealImg (fusion) (Ours) | Train: 100 synthetic + 10 real (mixed); test: 60 real | 36.2 | 61.3 | 30.1 | 46.9 | 25.3 | 30.8 | 32.2 | 22.3 | 34.2% |
| | 11.4 | 16.7 | 19.6 | 47.1 | 28.0 | 33.2 | 33.8 | 50.1 | |
| | 45.7 | 41.6 | 37.1 | 25.6 | 13.0 | 16.3 | 63.7 | 20.9 | |
| | 33.4 | 45.0 | 14.1 | 62.7 | 22.8 | 46.4 | 43.5 | 36.6 | |
SynImg-32Cls + RealImg (fusion) + RealImg (Ours) | Train: 100 synthetic + 10 real (mixed); fine-tune: 10 real; test: 60 real | 31.1 | 71.2 | 71.6 | 71.7 | 84.4 | 48.0 | 66.3 | 28.5 | 58.9% |
| | 84.4 | 33.3 | 82.0 | 51.7 | 51.0 | 90.3 | 76.4 | 54.8 | |
| | 59.6 | 67.8 | 69.4 | 37.8 | 23.2 | 22.9 | 66.0 | 23.6 | |
| | 40.6 | 46.0 | 75.3 | 77.2 | 83.5 | 67.8 | 63.6 | 64.2 | |
Note: Rows with fewer than eight AP values have cells missing in the source; the remaining values are listed in their original order and may not align with the class columns above.
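The mAP column can be checked against the per-class values. The sketch below assumes that mAP is the unweighted mean of the 32 per-class AP values (the usual definition, though the page does not state it) and recomputes it for the RealImg ([12]) rows of the table above; the variable and function names are illustrative.

```python
from statistics import mean

# Per-class AP values for RealImg ([12]), copied row by row from the table above.
ap_realimg_12 = [
    23.7, 57.5, 63.0, 69.6, 63.7, 50.6, 55.2, 26.8,
    79.0, 25.8, 61.2, 44.2, 45.9, 80.6, 64.3, 43.2,
    47.7, 58.2, 61.8, 21.3, 19.4, 17.4, 48.2, 17.8,
    34.8, 45.8, 71.8, 70.2, 79.6, 56.7, 56.9, 52.2,
]


def mean_average_precision(ap_values):
    """Unweighted mean of per-class AP values (assumed definition of mAP)."""
    return mean(ap_values)


map_value = mean_average_precision(ap_realimg_12)
print(f"{map_value:.1f}%")  # prints "50.4%", matching the reported mAP
```

The same check reproduces the reported mAP for the other rows that list all 32 values, which supports reading each method's four rows as the 32 FlickrLogos-32 classes in order.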
Table 3 Quantitative results showing the effect of each improvement in the proposed synthesis algorithm on performance (AP values)
Method | Adidas | Aldi | Apple | Becks | BMW | Carls | Chim | Coke | mAP |
| Corona | DHL | Erdi | Esso | Fedex | Ferra | Ford | Fost | |
| Google | Guin | Hein | HP | Milka | Nvid | Paul | Pepsi | |
| Ritt | Shell | Sing | Starb | Stel | Texa | Tsin | Ups | |
SynImg-32Cls + RealImg (Our Baseline) | 31.0 | 69.0 | 74.3 | 76.6 | 51.1 | 29.8 | | | |
| 35.3 | 70.9 | 53.5 | 49.0 | 76.9 | 52.9 | | | |
| 61.7 | 66.9 | 69.3 | 25.8 | 22.8 | | | | |
| 41.8 | 45.5 | 72.5 | 79.7 | 69.2 | 62.1 | | | |
Transparent Only + RealImg | 32.8 | 69.7 | 77.6 | 76.8 | 61.8 | 29.1 | | | 57.8% |
| 88.0 | 35.9 | 52.9 | 43.9 | 87.5 | 76.3 | 47.8 | | |
| 59.3 | 61.5 | 68.3 | 33.2 | 24.8 | 24.3 | 67.4 | 18.5 | |
| 47.5 | 78.6 | 73.6 | 80.3 | 69.2 | 61.3 | 61.2 | | |
Pixel-level Only + RealImg | 32.7 | 66.7 | 73.8 | 75.5 | 77.4 | 48.3 | 59.7 | | 58.1% |
| 87.5 | 71.7 | 53.6 | 87.7 | 72.2 | 53.5 | | | |
| 65.9 | 71.3 | 29.7 | 25.4 | 62.9 | | | | |
| 36.5 | 75.4 | 79.3 | 66.6 | 63.4 | 59.3 | | | |
Random Context + RealImg | 28.5 | 65.2 | 68.2 | 74.5 | 46.6 | 60.0 | 30.8 | | 56.7% |
| 86.4 | 33.5 | 70.1 | 50.4 | 83.5 | 75.9 | 52.2 | | |
| 61.6 | 63.3 | 69.8 | 28.5 | 26.3 | 22.5 | 62.6 | 23.8 | |
| 35.6 | 47.2 | 74.6 | 74.0 | 79.5 | 61.0 | 56.1 | | |
No Logo Transform + RealImg | 30.5 | 63.7 | 72.0 | 73.3 | 73.4 | 49.0 | 57.2 | 27.1 | 56.3% |
| 85.1 | 33.9 | 68.1 | 52.8 | 47.6 | 85.0 | 75.4 | 48.6 | |
| 56.2 | 62.2 | 69.1 | 33.9 | 24.0 | 26.0 | 67.6 | 23.6 | |
| 34.0 | 46.3 | 73.5 | 72.9 | 80.5 | 69.0 | 61.8 | 58.4 | |
Random Position + RealImg | 73.6 | 74.6 | 77.0 | 50.0 | 61.3 | 30.5 | | | 58.2% |
| 83.0 | 35.8 | 71.7 | 53.9 | 47.0 | 86.6 | | | |
| 57.9 | 33.7 | 20.4 | 24.9 | 64.3 | 24.1 | | | |
| 41.7 | 47.5 | 77.3 | 74.4 | 68.4 | 60.2 | | | |
Note: Rows with fewer than eight AP values have cells missing in the source; the remaining values are listed in their original order and may not align with the class columns above.
[1] FU Y B. Design and implementation of violence and fear video recognition system based on Logo mark detection[D]. Beijing: Beijing Jiaotong University, 2016.
[2] GAO Y, WANG F, LUAN H, et al. Brand data gathering from live social media streams[C]// ACM International Conference on Multimedia Retrieval. 2014: 169.
[3] PAN C, YAN Z, XU X, et al. Vehicle logo recognition based on deep learning architecture in video surveillance for intelligent traffic system[C]// IET International Conference on Smart and Sustainable City. 2013: 123-126.
[4] HE K, GKIOXARI G, DOLLAR P, et al. Mask R-CNN[C]// IEEE International Conference on Computer Vision. 2017: 2980-2988.
[5] WANG X, SHRIVASTAVA A, GUPTA A. A-Fast-RCNN: hard positive generation via adversary for object detection[C]// IEEE Conference on Computer Vision and Pattern Recognition. 2017: 3039-3048.
[6] LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot MultiBox detector[M]// Computer Vision-ECCV 2016. Springer International Publishing, 2016: 21-37.
[7] JOLY A, BUISSON O. Logo retrieval with a contrario visual query expansion[C]// International Conference on Multimedia 2009. 2009: 581-584.
[8] KALANTIDIS Y, PUEYO L G, TREVISIOL M, et al. Scalable triangulation-based logo recognition[C]// ACM International Conference on Multimedia Retrieval. 2011: 1-7.
[9] ROMBERG S, PUEYO L G, LIENHART R, et al. Scalable logo recognition in real-world images[C]// ACM International Conference on Multimedia Retrieval. 2011: 25.
[10] HOI S C H, WU X, LIU H, et al. LOGO-Net: large-scale deep logo detection and brand recognition with deep region-based convolutional networks[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2015, 46(5): 2403-2412.
[11] BIANCO S, BUZZELLI M, MAZZINI D, et al. Deep learning for logo recognition[J]. Neurocomputing, 2017, 245(C): 23-30.
[12] SU H, ZHU X, GONG S. Deep learning logo detection with data expansion by synthesising context[C]// IEEE Winter Conference on Applications of Computer Vision. 2017: 530-539.
[13] CHEN X, GUPTA A. Webly supervised learning of convolutional networks[C]// IEEE International Conference on Computer Vision. 2016: 1431-1439.
[14] SHRIVASTAVA A, GUPTA A, GIRSHICK R. Training region-based object detectors with online hard example mining[C]// IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016). 2016: 761-769.
[15] GUPTA A, VEDALDI A, ZISSERMAN A. Synthetic data for text localisation in natural images[C]// IEEE Computer Vision and Pattern Recognition. 2016: 2315-2324.
[16] JADERBERG M, SIMONYAN K, VEDALDI A, et al. Reading text in the wild with convolutional neural networks[J]. International Journal of Computer Vision, 2016, 116(1): 1-20.
[17] GEORGAKIS G, MOUSAVIAN A, BERG A C, et al. Synthesizing training data for object detection in indoor scenes[C]// Robotics: Science and Systems. 2017.
[18] EGGERT C, WINSCHEL A, LIENHART R. On the benefit of synthetic data for company logo detection[C]// ACM International Conference on Multimedia. 2015: 1283-1286.
[19] REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[C]// International Conference on Neural Information Processing Systems. 2015: 91-99.
[20] BENGIO Y, COLLOBERT R, WESTON J. Curriculum learning[C]// ACM International Conference on Machine Learning. 2009: 41-48.
[21] LIU B. Modest proposal for the principle of logo design[J]. Packaging Engineering, 2005, 127(2): 222-222.
[22] OLIVA A, TORRALBA A. The role of context in object recognition[J]. Trends in Cognitive Sciences, 2007, 11(12): 520.
[23] MOTTAGHI R, CHEN X, LIU X, et al. The role of context for object detection and semantic segmentation in the wild[C]// IEEE Computer Vision and Pattern Recognition. 2014: 891-898.
[24] KATTI H, PEELEN M V, ARUN S P. How do targets, nontargets, and scene context influence real-world object detection?[J]. Attention Perception & Psychophysics, 2017(2): 1-16.
[25] ZHOU B, LAPEDRIZA A, KHOSLA A, et al. Places: a 10 million image database for scene recognition[J]. IEEE Trans Pattern Anal Mach Intell, 2017, 99: 1-1.
[26] GUO J, GOULD S. Deep CNN ensemble with data augmentation for object detection[J]. Computer Science, 2015.
[27] OLIVEIRA G, FRAZÃO X, PIMENTEL A, et al. Automatic graphic logo detection via fast region-based convolutional networks[C]// IEEE International Joint Conference on Neural Networks. 2016.
[28] MUNNEKE J, BRENTARI V, PEELEN M. The influence of scene context on object recognition is independent of attentional focus[J]. Frontiers in Psychology, 2013, 4(8): 552.
[29] NGUYEN H V, HO H T, PATEL V M, et al. DASH-N: joint hierarchical domain adaptation and feature learning[J]. IEEE Transactions on Image Processing, 2015, 24(12): 5479-5491.