A semi-supervised deep learning algorithm combining consistency regularization and manifold regularization

doi:10.11959/j.issn.2096-0271.2022027

Abstract:

Semi-supervised learning has been widely used in big data analysis. Methods based on consistency regularization are currently one of the research hotspots in semi-supervised deep learning. However, these methods do not take the manifold structure of the data into account, so some nearby samples may receive very different outputs, which degrades classifier performance. To address this problem, a semi-supervised deep learning algorithm that combines consistency regularization with manifold regularization is proposed. While imposing a consistency constraint on the model, the algorithm constructs a graph over the samples and adds a smoothness loss. This achieves smoothness both within the local neighborhood of each sample point and between adjacent (connected) sample points, and thereby improves the generalization performance of the semi-supervised deep learning algorithm. Experimental results on several image and text datasets show that the proposed algorithm is more effective than other semi-supervised deep learning algorithms.

Key words: semi-supervised deep learning, consistency regularization, manifold regularization, smoothness constraint
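The abstract describes the objective only in words: a supervised term, a consistency term, and a graph-based smoothness term. The following is a minimal NumPy sketch of how such a combined loss could be assembled. It is not the authors' implementation. The function names, the Gaussian-noise perturbation, the k-nearest-neighbour graph built on raw inputs, the squared-difference penalties, and the weights `lambda_cons` and `lambda_graph` are all assumptions made for illustration, and the toy linear-softmax model stands in for a deep network.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def knn_graph(features, k):
    """Symmetric 0/1 adjacency matrix linking each sample to its k nearest neighbours."""
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    sq_dists = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(sq_dists, np.inf)  # a sample is never its own neighbour
    neighbours = np.argsort(sq_dists, axis=1)[:, :k]
    adjacency = np.zeros((n, n))
    adjacency[np.repeat(np.arange(n), k), neighbours.ravel()] = 1.0
    return np.maximum(adjacency, adjacency.T)  # make the graph undirected


def cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the true class."""
    return float(-np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12)))


def consistency_loss(probs_clean, probs_perturbed):
    """Mean squared gap between predictions on a sample and on its perturbed copy."""
    return float(np.mean(np.sum((probs_clean - probs_perturbed) ** 2, axis=1)))


def graph_smoothness_loss(probs, adjacency):
    """Average squared gap between the predictions of connected samples."""
    pairwise = ((probs[:, None, :] - probs[None, :, :]) ** 2).sum(axis=-1)
    return float((adjacency * pairwise).sum() / max(adjacency.sum(), 1.0))


def combined_loss(model, x_labeled, y_labeled, x_unlabeled, rng,
                  noise_std=0.1, k=3, lambda_cons=1.0, lambda_graph=1.0):
    """Supervised loss plus weighted consistency and graph-smoothness terms."""
    noise = noise_std * rng.standard_normal(x_unlabeled.shape)  # assumed perturbation
    probs_unlabeled = model(x_unlabeled)
    supervised = cross_entropy(model(x_labeled), y_labeled)
    consistency = consistency_loss(probs_unlabeled, model(x_unlabeled + noise))
    smoothness = graph_smoothness_loss(probs_unlabeled, knn_graph(x_unlabeled, k))
    total = supervised + lambda_cons * consistency + lambda_graph * smoothness
    return total, (supervised, consistency, smoothness)


# Toy usage: a random linear-softmax "model" on synthetic data (illustrative only).
rng = np.random.default_rng(0)
weights = rng.standard_normal((4, 3))


def model(x):
    return softmax(x @ weights)


x_l = rng.standard_normal((6, 4))
y_l = rng.integers(0, 3, size=6)
x_u = rng.standard_normal((20, 4))

total, parts = combined_loss(model, x_l, y_l, x_u, rng)
print("total loss:", total)
print("supervised, consistency, smoothness:", parts)
```

In this sketch the consistency term compares each unlabeled sample with a perturbed copy of itself, while the smoothness term compares each sample with its graph neighbours, which mirrors the two kinds of smoothness the abstract mentions. In a real training loop the graph would more plausibly be built on learned features per mini-batch, and the losses would be written in an automatic-differentiation framework; both choices are left open here.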

CLC number: TP183

Jie WANG, Songyan ZHANG, Jiye LIANG. A semi-supervised deep learning algorithm combining consistency regularization and manifold regularization[J]. Big Data Research, 2022, 8(3): 103-114.

Dataset	Training set	Validation set	Test set	Feature dimension	Classes
CIFAR-10	45 000	5 000	10 000	32×32×3	10
CIFAR-100	45 000	5 000	10 000	32×32×3	100
SVHN	65 932	7 325	26 032	32×32×3	10
IMDB	63 000	7 000	25 000	128	2
Yahoo! Answers	45 000	5 000	60 000	128	10

Algorithm/Model	250 labels	500 labels	1 000 labels
Π	53.02%	41.82%	31.53%
mean teacher	47.32%	42.01%	17.32%
VAT	36.03%	26.11%	18.68%
MixMatch	17.60%	13.77%	11.16%
SmoothMatch	14.40%	12.99%	10.22%

Algorithm/Model	250 labels	500 labels	1 000 labels
Π	17.65%	11.44%	8.60%
mean teacher	5.85%	5.45%	5.21%
VAT	8.41%	7.44%	5.98%
MixMatch	5.58%	5.46%	4.45%
SmoothMatch	5.11%	5.04%	4.23%

Algorithm/Model	Error rate
Π	39.19%
mean teacher	37.17%
VAT	35.42%
MixMatch	32.88%
SmoothMatch	32.23%

Algorithm/Model	20 labels	100 labels
BERT	32.50%	19.59%
UDA	13.70%	9.50%
SmoothMatch	12.27%	8.96%