Telecommunications Science (电信科学), 2013, Vol. 29, Issue 11: 6-11. doi: 10.3969/j.issn.1000-0801.2013.11.002

• Big Data Platforms and Applications •

基于DPI数据挖掘实现URL分类挂载的相关技术研究

边凌燕,贺仁龙,姚晓辉   

  1. 中国电信股份有限公司上海研究院 上海200122
  • 出版日期:2013-11-20 发布日期:2017-07-04

Research on URL Classification with DPI Data Mining and Related Technology

Lingyan Bian,Renlong He,Xiaohui Yao   

  1. Shanghai Research Institute of China Telecom Co., Ltd., Shanghai 200122, China
  • Published: 2013-11-20 Online: 2017-07-04

摘要:

通过对DPI用户上网行为数据进行深入挖掘,实现与网页URL分类体系的归类映射,是精准锁定上网用户兴趣偏好特征的关键。在梳理DPI数据自动挂载URL分类节点流程的基础上,重点研究了过程中涉及的网页信息提取、中文分词、特征选择及文本分类等关键技术,为利用DPI数据提升客户洞察能力铺平了技术道路。

关键词: 深度分组检测, 中文分词, 特征选择, 文本分类

Abstract:

Mining DPI data on users' online behavior in depth and mapping it onto a Web page URL classification system is the key to accurately identifying Internet users' interests and preferences. Building on an outline of the process by which DPI data is automatically attached to URL classification nodes, this paper examines the key technologies involved: Web page information extraction, Chinese word segmentation, feature selection, and text classification. These lay a technical foundation for using DPI data to improve customer insight.

Key words: deep packet inspection, Chinese word segmentation, feature selection, text classification
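
The abstract names the stages of the pipeline but not the specific algorithms, so the following is only a minimal sketch of how word segmentation, feature selection, and text classification could fit together. It assumes forward maximum matching for segmentation, chi-square scoring for feature selection, and multinomial naive Bayes for classification, all common stand-ins rather than the authors' confirmed methods. The lexicon, training pages, category labels, and function names are hypothetical, and Web page information extraction is skipped by assuming the page text is already available.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicon; a real system would load a large dictionary.
LEXICON = {"足球", "比赛", "球队", "股票", "基金", "市场", "新闻"}
MAX_WORD_LEN = max(len(w) for w in LEXICON)

def segment(text):
    """Forward maximum matching: take the longest lexicon word at each position."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in LEXICON:  # fall back to a single character
                words.append(piece)
                i += size
                break
    return words

def chi_square_select(docs, labels, k):
    """Keep the k terms with the highest chi-square score against any class."""
    n = len(docs)
    class_sizes = Counter(labels)
    doc_freq = defaultdict(Counter)  # term -> class -> documents containing it
    for words, label in zip(docs, labels):
        for term in set(words):
            doc_freq[term][label] += 1
    scores = {}
    for term, per_class in doc_freq.items():
        total = sum(per_class.values())
        best = 0.0
        for label, size in class_sizes.items():
            a = per_class[label]   # in class, contains term
            b = total - a          # other classes, contains term
            c = size - a           # in class, lacks term
            d = n - size - b       # other classes, lacks term
            denom = (a + c) * (b + d) * (a + b) * (c + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[term] = best
    return set(sorted(scores, key=lambda t: (-scores[t], t))[:k])

def train_naive_bayes(docs, labels, vocab):
    """Count class frequencies and per-class term frequencies over vocab."""
    class_counts = Counter(labels)
    term_counts = defaultdict(Counter)
    for words, label in zip(docs, labels):
        term_counts[label].update(w for w in words if w in vocab)
    return class_counts, term_counts, vocab

def classify(model, words):
    """Return the class with the highest add-one-smoothed log posterior."""
    class_counts, term_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        denom = sum(term_counts[label].values()) + len(vocab)
        score = math.log(class_counts[label] / total_docs)
        for w in words:
            if w in vocab:
                score += math.log((term_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labelled page texts standing in for URL classification nodes.
TRAINING_PAGES = [
    ("足球比赛新闻", "sports"),
    ("球队比赛", "sports"),
    ("股票市场新闻", "finance"),
    ("基金股票", "finance"),
]
docs = [segment(text) for text, _ in TRAINING_PAGES]
labels = [label for _, label in TRAINING_PAGES]
vocab = chi_square_select(docs, labels, k=6)
model = train_naive_bayes(docs, labels, vocab)
print(classify(model, segment("足球球队新闻")))
```

In this toy setup the term 新闻 appears equally often in both categories, so its chi-square score is zero and it is dropped from the vocabulary before training. That is the role feature selection plays ahead of classification.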
