Journal on Communications (通信学报) ›› 2013, Vol. 34 ›› Issue (10): 121-134. doi: 10.3969/j.issn.1000-436x.2013.10.015

• Technical Report •

Canonical correlation analysis of big data based on cloud model

Jing YANG (杨静), Wen-ping LI (李文平), Jian-pei ZHANG (张健沛)

  1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
  • Online: 2013-10-25 Published: 2017-08-10
  • Supported by:
    The National Natural Science Foundation of China; The National Natural Science Foundation of China; The National Natural Science Foundation of China; The Research Fund for the Doctoral Program of Higher Education of China; The Research Fund for the Doctoral Program of Higher Education of China; The Natural Science Foundation of Heilongjiang Province; The Harbin Special Funds for Technological Innovation Research (Outstanding Academic Leaders)

Abstract:

Traditional canonical correlation analysis (CCA) methods are too computationally complex to cope with big data, whose scale is now reaching the petabyte level. A novel CCA approach for big data was proposed by introducing the cloud model, a recent theory from the field of artificial intelligence with uncertainty. A distributed architecture based on cloud computing was established. The clouds held on the endpoint nodes of this architecture were combined into a center cloud via cloud operations (here a cloud is a synopsis of data, a concept from cloud theory). Center cloud drops were then generated from the center cloud to serve as a small sample that restores the big data under uncertainty. Finally, CCA was performed on these cloud drops; because their volume is much smaller than that of the original data, computational efficiency improves. Experimental results on real data sets indicate the effectiveness of this method.
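The summarize, merge, and regenerate steps described in the abstract can be illustrated with a minimal one-dimensional sketch. It assumes the standard normal cloud model, in which a cloud is described by three numbers (expectation Ex, entropy En, hyper-entropy He), with the usual backward generator (samples to cloud) and forward generator (cloud to drops). The function `merge_clouds` is a hypothetical stand-in for the cloud operation: it pools node clouds by sample-size-weighted moments and is not taken from the paper. All function names and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def backward_cloud(x):
    """Estimate (Ex, En, He) of a 1-D normal cloud from raw samples."""
    x = np.asarray(x, dtype=float)
    ex = x.mean()
    en = np.sqrt(np.pi / 2.0) * np.abs(x - ex).mean()
    # Drop variance of a normal cloud is En^2 + He^2, so He^2 = S^2 - En^2.
    he = np.sqrt(max(x.var(ddof=1) - en ** 2, 0.0))
    return ex, en, he

def merge_clouds(clouds, counts):
    """Hypothetical merge rule (assumption, not the paper's cloud operation):
    pool the node clouds using sample-size-weighted moments."""
    w = np.asarray(counts, dtype=float)
    w /= w.sum()
    ex_i, en_i, he_i = (np.asarray(c, dtype=float) for c in zip(*clouds))
    ex = np.sum(w * ex_i)
    # Within-node entropy plus between-node spread of the expectations.
    en = np.sqrt(np.sum(w * (en_i ** 2 + (ex_i - ex) ** 2)))
    he = np.sqrt(np.sum(w * he_i ** 2))
    return ex, en, he

def forward_cloud(ex, en, he, n_drops, rng):
    """Generate cloud drops: En' ~ N(En, He^2), then x ~ N(Ex, En'^2)."""
    en_prime = rng.normal(en, he, size=n_drops)
    return rng.normal(ex, np.abs(en_prime))

# Synthetic stand-in for data spread over three endpoint nodes.
rng = np.random.default_rng(0)
nodes = [rng.normal(10.0, 2.0, size=n) for n in (50_000, 30_000, 20_000)]

clouds = [backward_cloud(x) for x in nodes]              # one synopsis per node
center = merge_clouds(clouds, [len(x) for x in nodes])   # center cloud
drops = forward_cloud(*center, n_drops=2_000, rng=rng)   # small surrogate sample
```

Only the three-number synopses need to leave each node, and the 2,000 drops stand in for 100,000 original values. The sketch treats a single variable; it does not show how the joint structure between the two variable sets, which CCA depends on, would be carried through the clouds.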

Key words: big data, canonical correlation analysis (CCA), cloud model, cloud operation, cloud computing
