大数据 (Big Data Research), 2022, Vol. 8, Issue (3): 103-114. doi: 10.11959/j.issn.2096-0271.2022027

• Research •

融合一致性正则与流形正则的半监督深度学习算法

王杰1,2, 张松岩1,2, 梁吉业1,2   

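  • About the authors: Jie WANG (b. 1990), male, is a Ph.D. candidate at the School of Computer and Information Technology, Shanxi University. His main research interests are data mining and machine learning.
    Songyan ZHANG (b. 1996), male, is a master's student at the School of Computer and Information Technology, Shanxi University. His main research interests are data mining and machine learning.
    Jiye LIANG (b. 1962), male, Ph.D., is a professor at the School of Computer and Information Technology, Shanxi University, and a Fellow of the China Computer Federation. His main research interests are data mining and machine learning, and he has published more than 200 academic papers in major journals and conferences in China and abroad.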

A semi-supervised deep learning algorithm combining consistency regularization and manifold regularization

Jie WANG1,2, Songyan ZHANG1,2, Jiye LIANG1,2   

  1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
  2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Taiyuan 030006, China
  • Online: 2022-05-15  Published: 2022-05-01
  • Supported by:
    The National Natural Science Foundation of China (61976184); The Key Project of Research and Development Plan of Shanxi Province (201903D121162); The 1331 Engineering Project of Shanxi Province

Abstract:

Semi-supervised learning has been widely used in big data analysis. Consistency-based methods are currently one of the most active research topics in semi-supervised deep learning. However, such methods do not take the manifold structure of the data into account, so some similar samples may receive very different outputs, which degrades classifier performance. To address this problem, a semi-supervised deep learning algorithm that combines consistency regularization with manifold regularization is proposed. While imposing a consistency constraint on the model, the algorithm constructs a graph over the samples and adds a smoothness loss, achieving smoothness both within the local neighborhood of each sample point and between adjacent (connected) sample points, thereby improving the generalization performance of semi-supervised deep learning. Experimental results on several image and text datasets show that the proposed algorithm is more effective than other semi-supervised deep learning algorithms.

Key words: semi-supervised deep learning, consistency regularization, manifold regularization, smoothness constraint
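
The idea summarized in the abstract, a consistency term plus a graph-based smoothness term added to the supervised loss, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not specify the graph construction, the distance used by either term, or the loss weights, so the 0/1 k-nearest-neighbour graph, the squared-error penalties, and the names `knn_graph`, `combined_loss`, `lam_c` and `lam_m` are all assumptions made here for illustration.

```python
import numpy as np


def knn_graph(x, k):
    """Symmetric 0/1 adjacency matrix linking each sample to its k nearest
    neighbours (an assumed graph construction, not taken from the paper)."""
    d = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d, np.inf)            # a sample is not its own neighbour
    idx = np.argsort(d, axis=1)[:, :k]     # indices of the k closest samples
    w = np.zeros(d.shape)
    w[np.arange(len(x))[:, None], idx] = 1.0
    return np.maximum(w, w.T)              # make the graph undirected


def combined_loss(p_clean, p_noisy, w, y, labeled, lam_c=1.0, lam_m=1.0):
    """Supervised cross-entropy + consistency term + manifold (graph) term.

    p_clean, p_noisy: class probabilities for the original and perturbed inputs
    w:                adjacency matrix from knn_graph
    y:                integer labels (entries of unlabeled samples are ignored)
    labeled:          integer indices of the labeled samples
    lam_c, lam_m:     hypothetical weights of the two regularizers
    """
    # Cross-entropy on the labeled samples only.
    ce = -np.log(p_clean[labeled, y[labeled]] + 1e-12).mean()
    # Consistency: a sample and its perturbed copy should get similar outputs.
    consistency = ((p_clean - p_noisy) ** 2).sum(axis=1).mean()
    # Manifold smoothness: connected samples should get similar outputs.
    diff = ((p_clean[:, None, :] - p_clean[None, :, :]) ** 2).sum(axis=-1)
    manifold = (w * diff).sum() / max(w.sum(), 1.0)
    return ce + lam_c * consistency + lam_m * manifold
```

In this sketch the consistency term covers the local neighbourhood of each individual sample, while the manifold term penalizes disagreement between samples joined by an edge, so the two regularizers act on different pairs of points. In practice the probabilities would come from a neural network and the loss would be minimized with an automatic-differentiation framework.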
