Telecommunications Science (电信科学)


A Distributed Public Opinion Analysis System Architecture

Huang Yupeng, Yuan Chang, Hao Zhifeng, Cai Ruichu, Xiao Xiaojun, Lu Yu

  1. School of Computers, Guangdong University of Technology (Huang Yupeng, Yuan Chang, Hao Zhifeng, Cai Ruichu)
  2. Guangzhou Useease Information Technology Co., Ltd. (Xiao Xiaojun, Lu Yu)
  • Online: 2013-07-15 Published: 2013-07-15

Abstract: As internet data grows rapidly, collecting and analyzing it effectively has become a challenge. To address this, a system architecture based on a distributed platform is proposed. The architecture consists of three modules: a crawler module, a Web module, and a distributed platform. The crawler module collects data; the Web module handles simple tasks and visualizes the analysis results; the distributed platform provides data storage and the computation for complex tasks. Together, the three modules offer a practical solution for crawling, storing, and analyzing massive amounts of data from the web. Finally, an application case on the social networking site Sina Weibo verifies the usability of the proposed distributed public opinion analysis architecture.
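The division of labor among the three modules can be illustrated with a toy, single-process sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the class names `Crawler`, `DistributedPlatform`, and `WebModule` are hypothetical, the crawler reads from an in-memory list rather than the network, the "distributed platform" is a set of in-memory partitions processed by a thread pool, and word counting stands in for whatever complex analysis the real system runs.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


class Crawler:
    """Collects raw posts (hypothetical: reads an in-memory list, not the web)."""

    def __init__(self, source):
        self.source = source

    def crawl(self):
        # Drop blank entries and trim surrounding whitespace.
        return [post.strip() for post in self.source if post.strip()]


class DistributedPlatform:
    """Stores posts in partitions and runs heavier jobs over them in parallel."""

    def __init__(self, num_partitions=2):
        self.partitions = [[] for _ in range(num_partitions)]

    def store(self, posts):
        # Round-robin placement stands in for a real distributed store.
        for i, post in enumerate(posts):
            self.partitions[i % len(self.partitions)].append(post)

    def word_counts(self):
        # Map step: count words per partition. Reduce step: merge the counts.
        def count(partition):
            return Counter(w.lower() for post in partition for w in post.split())

        with ThreadPoolExecutor() as pool:
            partials = list(pool.map(count, self.partitions))
        return sum(partials, Counter())


class WebModule:
    """Handles simple tasks itself and renders results of complex jobs."""

    def __init__(self, platform):
        self.platform = platform

    def post_count(self):
        # A simple task answered without launching a parallel job.
        return sum(len(p) for p in self.platform.partitions)

    def render_top_words(self, n=3):
        # A plain-text bar chart as a stand-in for a visual display.
        top = self.platform.word_counts().most_common(n)
        return "\n".join(f"{word:<10} {'#' * count}" for word, count in top)


if __name__ == "__main__":
    posts = ["traffic is bad today", "  ", "traffic jam again", "good weather today"]
    platform = DistributedPlatform(num_partitions=2)
    platform.store(Crawler(posts).crawl())
    web = WebModule(platform)
    print(web.post_count())
    print(web.render_top_words(2))
```

The sketch keeps the Web module dependent on the platform only through `store`-side data and `word_counts`, mirroring the abstract's split between simple tasks handled at the Web layer and complex tasks delegated to the distributed platform.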
