Telecommunications Science ›› 2014, Vol. 30 ›› Issue (2): 65-69. doi: 10.3969/j.issn.1000-0801.2014.02.008

• Research and Development •

Image Classification and Annotation Based on Hadoop Cloud Computing Platform

Jiyuan Lu (陆寄远)1, Chenghui Huang (黄承慧)1, Fang Hou (侯昉)1, Bin Li (李斌)2

  1. Department of Computer Science and Technology, Guangdong University of Finance, Guangzhou 510521, China
  2. Oracle Research and Development Center (Shenzhen) Co., Ltd., Shenzhen 518057, China
  • Online: 2014-02-15 Published: 2017-06-20
  • Supported by:
    National Natural Science Foundation of China; Natural Science Foundation of Guangdong Province; Natural Science Foundation of Guangdong Province

Abstract:

To process and exploit the massive volume of image and video data on the Internet effectively, an image classification and annotation solution based on the Hadoop cloud platform is proposed. To address the key problem of extracting training sets efficiently, a cloud-based image crawling platform was built that uses Internet image resources as the raw data set, providing enough data from which to extract training images. A training image extractor based on probabilistic latent semantic analysis (pLSA) was then implemented; it clusters the raw data set by topic to help users select training images quickly. Finally, an SVM classification model was integrated, which uses the extracted training set to classify and annotate unlabeled images, completing the system. Experimental results show that the solution meets both the functional and the performance requirements of classifying and annotating massive image data.

Key words: cloud computing, training set extraction, support vector machine, visual feature extraction
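
The pLSA-based training set extraction named in the abstract can be illustrated with a small single-machine sketch. This is not the authors' Hadoop implementation, which this page does not show; it only demonstrates the standard pLSA EM updates on a toy image-by-visual-word count matrix. The function names `plsa` and `dominant_topics`, the parameters, and the toy data are all assumptions made for illustration.

```python
import random


def plsa(counts, n_topics, n_iter=100, seed=0):
    """Fit pLSA by EM on an image-by-visual-word count matrix.

    counts[d][w] is how often visual word w occurs in image d.
    Returns (p_z_d, p_w_z): P(topic | image) and P(word | topic),
    each as a list of rows that sum to 1.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        total = sum(row)
        if total == 0.0:  # guard: fall back to a uniform distribution
            return [1.0 / len(row)] * len(row)
        return [v / total for v in row]

    # Random positive initialization breaks the symmetry between topics.
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z | d, w) is proportional to P(z | d) * P(w | z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                norm = sum(joint)
                if norm == 0.0:
                    continue
                for z in range(n_topics):
                    resp = n * joint[z] / norm
                    new_z_d[d][z] += resp
                    new_w_z[z][w] += resp
        # M-step: renormalize the expected counts into distributions.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
    return p_z_d, p_w_z


def dominant_topics(p_z_d):
    """Assign each image to its most probable topic (a simple clustering)."""
    return [max(range(len(row)), key=row.__getitem__) for row in p_z_d]


# Toy data (assumed): images 0-1 share visual words 0-1, images 2-3 share words 2-3.
counts = [[5, 4, 0, 0], [4, 6, 0, 0], [0, 0, 5, 5], [0, 0, 6, 3]]
p_z_d, p_w_z = plsa(counts, n_topics=2)
labels = dominant_topics(p_z_d)
```

In a workflow like the one the abstract describes, a user would then review the images grouped under each topic and pick training images from the relevant groups. Which topic index each group receives depends on the random initialization.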
