Weakly supervised semantic segmentation and optimization algorithm based on multi-scale feature model

doi:10.11959/j.issn.1000-436x.2019004

Abstract

To improve the accuracy of weakly supervised semantic segmentation, a segmentation and optimization algorithm that combines multi-scale features is proposed. The algorithm first constructs a multi-scale feature model based on a transfer learning algorithm. A new classifier is also introduced for category prediction, to reduce segmentation failures caused by errors in the predicted target class information. The multi-scale model is then fused with the original transfer learning model using different weights to enhance the generalization performance of the model. Finally, the credibility of each predicted class is used to adjust the credibility of the pixels of the corresponding class in the segmentation map, which avoids false-positive segmentation regions. Tested on the challenging VOC 2012 dataset, the proposed algorithm achieves a mean intersection-over-union (mIoU) of 58.8% on the validation set and 57.5% on the test set, a relative improvement of 12.9% and 12.3%, respectively, over the original transfer learning algorithm. It also performs favorably against other weakly supervised segmentation methods that use category labels.
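As a rough illustration of the last two steps described in the abstract (weighted fusion of the two models, then class-credibility adjustment), the following Python sketch works on toy per-pixel class scores. The abstract does not give the formulas, so the weighted sum, the multiplicative credibility scaling, the function names, and all numbers here are assumptions for illustration only, not the authors' implementation.

```python
from typing import List

ScoreMap = List[List[float]]  # one row per pixel, one column per class


def fuse_scores(original: ScoreMap, multi_scale: ScoreMap,
                w_original: float, w_multi_scale: float) -> ScoreMap:
    """Weighted sum of two per-pixel class-score maps (assumed fusion rule)."""
    return [
        [w_original * o + w_multi_scale * m for o, m in zip(row_o, row_m)]
        for row_o, row_m in zip(original, multi_scale)
    ]


def apply_class_credibility(scores: ScoreMap,
                            credibility: List[float]) -> ScoreMap:
    """Scale each class's pixel scores by an image-level class credibility
    (assumed to be multiplicative)."""
    return [[s * c for s, c in zip(row, credibility)] for row in scores]


def label_pixels(scores: ScoreMap) -> List[int]:
    """Assign each pixel the index of its highest-scoring class."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


# Toy data: 3 pixels, 3 classes (0 = background, 1 and 2 = object classes).
original = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.35, 0.25, 0.4]]
multi_scale = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.3, 0.2, 0.5]]
credibility = [1.0, 0.9, 0.1]  # hypothetical: class 2 is judged unlikely

fused = fuse_scores(original, multi_scale, w_original=0.5, w_multi_scale=0.5)
labels_before = label_pixels(fused)
labels_after = label_pixels(apply_class_credibility(fused, credibility))
print(labels_before, labels_after)
```

With these toy values the third pixel is first labelled as class 2; once class 2 is down-weighted by its low credibility, the pixel falls back to background, which is the kind of false-positive suppression the abstract describes.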

Key words: deep learning, weakly-supervised learning, model integration, multi-scale feature, model optimization

CLC Number:

TP18
TP391.4

Changzhen XIONG, Hui ZHI. Weakly supervised semantic segmentation and optimization algorithm based on multi-scale feature model[J]. Journal on Communications, 2019, 40(1): 163-171.

Figures/Tables 6

References 30

[1]	GUAN T, ZHOU D X, LIU Y H. Color optical microscopic cell image segmentation based on color difference vector field[J]. Acta Optica Sinica, 2014, 34(01): 0115001.
[2]	SUN Y K. Medical image processing techniques based on optical coherence tomography and their applications[J]. Optics and Precision Engineering, 2014, 22(04): 1086-1104.
[3]	ESS A, MUELLER T, GRABNER H, et al. Segmentation-based urban traffic scene understanding[C]// British Machine Vision Conference, BMVC. 2009(84): 1-11.
[4]	WAN J, WANG D Y, HOI S C H, et al. Deep learning for content-based image retrieval: a comprehensive study[C]// The 22nd ACM International Conference on Multimedia. 2014, 978: 157-166.
[5]	OBERWEGER M, WOHLHART P, LEPETIT V. Hands deep in deep learning for hand pose estimation[C]// Computer Vision Winter Workshop. 2015: 21-30.
[6]	XIANG S B, SU G D, REN X L, et al. Embedded implementation of real-time finger interaction system[J]. Optics and Precision Engineering, 2011, 19(08): 1911-1920.
[7]	HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]// 2017 IEEE International Conference on Computer Vision. 2017, 2380: 2980-2988.
[8]	PATHAK D, SHELHAMER E, LONG J, et al. Fully convolutional multi-class multiple instance learning[C]// International Conference on Learning Representations. 2015: 1-4.
[9]	PATHAK D, KRAHENBUHL P, DARRELL T. Constrained convolutional neural networks for weakly supervised segmentation[C]// IEEE International Conference on Computer Vision. 2015, 1550: 1796-1804.
[10]	KWAK S, HONG S, HAN B. Weakly supervised semantic segmentation using superpixel pooling network[C]// AAAI Conference on Artificial Intelligence. 2017: 4111-4117.
[11]	KOLESNIKOV A, LAMPERT C H. Seed, expand and constrain: three principles for weakly-supervised image segmentation[C]// European Conference on Computer Vision. 2016, 9908: 695-711.
[12]	LIN L, WANG G R, ZHANG R, et al. Deep structured scene parsing by learning with image descriptions[C]// IEEE Conference on Computer Vision and Pattern Recognition. 2016, 1063: 2276-2284.
[13]	HONG S, OH J, LEE H, et al. Learning transferrable knowledge for semantic segmentation with deep convolutional neural network[C]// IEEE Conference on Computer Vision and Pattern Recognition. 2016, 1063: 3204-3212.
[14]	HONG S, YEO D, KWAK S, et al. Weakly supervised semantic segmentation using web-crawled videos[C]// IEEE Conference on Computer Vision and Pattern Recognition. 2017, 1063: 2224-2232.
[15]	BEARMAN A, RUSSAKOVSKY O, FERRARI V, et al. What's the point: semantic segmentation with point supervision[C]// European Conference on Computer Vision. 2016, 9911: 549-565.
[16]	PAPANDREOU G, CHEN L C, MURPHY K, et al. Weakly and semi-supervised learning of a DCNN for semantic image segmentation[C]// IEEE International Conference on Computer Vision. 2015, 1550: 1742-1750.
[17]	DAI J F, HE K M, SUN J. BoxSup: exploiting bounding boxes to supervise convolutional networks for semantic segmentation[C]// IEEE International Conference on Computer Vision. 2015, 1550: 1635-1643.
[18]	LIN D, DAI J F, JIA J Y, et al. ScribbleSup: scribble-supervised convolutional networks for semantic segmentation[C]// IEEE Conference on Computer Vision and Pattern Recognition. 2016, 1063: 3159-3167.
[19]	CHEN L C, YANG Y, WANG J, et al. Attention to scale: scale-aware semantic image segmentation[C]// IEEE Conference on Computer Vision and Pattern Recognition. 2016, 1063: 3640-3649.
[20]	YU F, KOLTUN V. Multi-scale context aggregation by dilated convolutions[C]// International Conference on Learning Representations. 2015: 1-13.
[21]	ZHAO H S, SHI J P, QI X J, et al. Pyramid scene parsing network[C]// IEEE Conference on Computer Vision and Pattern Recognition. 2017, 1063: 6230-6239.
[22]	HONG S, ROH B, KIM K H, et al. PVANet: lightweight deep neural networks for real-time object detection[C]// Advances in Neural Information Processing Systems. 2016: 1-7.
[23]	REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[24]	DOU Y, KONG L F, WANG L F. A computational model of visual attention based on visual entropy[J]. Acta Optica Sinica, 2009, 29(9): 2511-2515.
[25]	LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]// European Conference on Computer Vision. 2014, 8693: 740-755.
[26]	EVERINGHAM M, VAN GOOL L, WILLIAMS C K I, et al. The PASCAL visual object classes (VOC) challenge[J]. International Journal of Computer Vision, 2010, 88(2): 303-338.
[27]	ZHOU Z H. Machine learning[M]. Beijing: Tsinghua University Press, 2016: 171-184.
[28]	AHN J, KWAK S. Learning pixel-level semantic affinity with image-level supervision for weakly supervised semantic segmentation[C]// IEEE Conference on Computer Vision and Pattern Recognition. 2018: 4981-4990.
[29]	QI X J, LIU Z Z, SHI J P, et al. Augmented feedback in semantic segmentation under image level supervision[C]// European Conference on Computer Vision. 2016, 9912: 90-105.
[30]	WEI Y C, FENG J S, LIANG X D, et al. Object region mining with adversarial erasing: a simple classification to semantic segmentation approach[C]// IEEE Conference on Computer Vision and Pattern Recognition. 2017, 1063: 6488-6496.

VOC 2012 validation set	O	M	M_c	M+O_c	M+O_c_p	M+O_gt
background	85.3%	86.9%	87.1%	87.7%	87.7%	87.7%
aeroplane	68.5%	73.5%	76.1%	77.5%	77.5%	76.9%
bicycle	26.4%	27.0%	26.4%	28.5%	28.7%	27.8%
bird	69.8%	66.6%	66.6%	74.5%	73.5%	73.0%
boat	36.7%	33.8%	33.9%	38.9%	44.0%	41.6%
bottle	49.1%	52.1%	43.7%	49.8%	51.3%	45.6%
bus	68.4%	81.8%	82.8%	82.3%	82.6%	81.8%
car	55.8%	53.1%	51.7%	55.9%	58.9%	58.5%
cat	77.3%	73.7%	76.9%	81.5%	81.0%	81.2%
chair	6.2%	4.1%	7.6%	12.3%	15.1%	14.8%
cow	75.2%	74.7%	85.4%	87.0%	86.8%	86.9%
diningtable	14.3%	8.2%	10.0%	10.2%	13.0%	13.0%
dog	69.8%	61.1%	61.7%	69.4%	70.7%	71.0%
horse	71.5%	70.6%	77.1%	80.6%	80.2%	80.6%
motorbike	61.1%	67.2%	69.7%	72.5%	72.9%	72.9%
person	31.9%	45.1%	41.9%	41.1%	42.8%	40.7%
pottedplant	25.5%	20.7%	20.6%	21.1%	23.4%	22.8%
sheep	74.6%	68.4%	85.4%	87.5%	87.4%	87.4%
sofa	33.8%	31.9%	40.2%	44.5%	46.4%	46.4%
train	49.6%	58.4%	60.3%	59.9%	59.9%	59.9%
tvmonitor	43.7%	46.7%	49.7%	52.8%	51.6%	50.3%
mIoU	52.1%	52.6%	55.0%	57.9%	58.8%	58.1%
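The mIoU row is consistent with the usual PASCAL VOC definition: the unweighted mean of the 21 per-class IoU values (20 object classes plus background). The short check below recomputes it for two columns, O and M+O_c_p, using values copied from the table; the variable names are illustrative.

```python
from statistics import mean

# Per-class IoU (percent), in table order from background to tvmonitor.
iou_o = [85.3, 68.5, 26.4, 69.8, 36.7, 49.1, 68.4, 55.8, 77.3, 6.2, 75.2,
         14.3, 69.8, 71.5, 61.1, 31.9, 25.5, 74.6, 33.8, 49.6, 43.7]
iou_m_o_c_p = [87.7, 77.5, 28.7, 73.5, 44.0, 51.3, 82.6, 58.9, 81.0, 15.1,
               86.8, 13.0, 70.7, 80.2, 72.9, 42.8, 23.4, 87.4, 46.4, 59.9,
               51.6]

# Unweighted mean over all 21 classes, rounded to one decimal like the table.
miou_o = round(mean(iou_o), 1)
miou_m_o_c_p = round(mean(iou_m_o_c_p), 1)
print(miou_o, miou_m_o_c_p)  # 52.1 58.8
```

Both results match the reported mIoU row, so every class, including the weak chair and diningtable classes, carries equal weight in the headline figure.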

Algorithm	Supervision type	mIoU (validation set)	mIoU (test set)
SEC (ECCV 2016)^[11]	I	50.7%	51.7%
*TransferNet (CVPR 2016)^[13]	I	52.1%	51.2%
*AF-MCG (ECCV 2016)^[29]	I	54.3%	55.5%
AE-PSL (CVPR 2017)^[30]	I	55.0%	55.7%
*CrawlSeg (CVPR 2017)^[14]	I	58.1%	58.7%
AffinityNet (DeepLab, 2018)^[28]	I	58.4%	60.5%
What'sPoint (ECCV 2016)^[15]	P	46.0%	43.6%
WSSL (ICCV 2015)^[16]	B	60.6%	62.2%
BoxSup (ICCV 2015)^[17]	B	62.0%	64.2%
ScribbleSup (CVPR 2016)^[18]	S	63.1%	-
M+O_c_p	I	58.8%	57.5%
Note: "*" indicates that the algorithm uses additional strong supervision information.
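The 12.9% and 12.3% gains quoted in the abstract can be reproduced from the TransferNet and M+O_c_p rows of the comparison table if they are read as relative improvements rather than percentage-point differences. A small check, with the helper name `relative_gain` chosen here for illustration:

```python
def relative_gain(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


# mIoU values (percent) copied from the comparison table.
transfernet = {"validation": 52.1, "test": 51.2}
proposed = {"validation": 58.8, "test": 57.5}

for split in ("validation", "test"):
    absolute = proposed[split] - transfernet[split]
    relative = relative_gain(transfernet[split], proposed[split])
    print(f"{split}: +{absolute:.1f} points, {relative:.1f}% relative")
```

In absolute terms the gains are 6.7 and 6.3 percentage points, which correspond to the 12.9% and 12.3% relative figures.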
