Telecommunications Science ›› 2019, Vol. 35 ›› Issue (7): 136-144. doi: 10.11959/j.issn.1000-0801.2019090

• Operational Technology Wide Angle •

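Application of machine learning in the fake user identification of IoT

Rongfang ZHANG¹, Dandan XU¹, Yuanguang WANG², Siyu PAN¹, Zhengmao LI³

  1. Research Institute of China United Network Communication Co., Ltd., Beijing 100176, China
  2. E-Commerce Center, China United Network Communication Group Co., Ltd., Beijing 100033, China
  3. Chongqing Branch of China United Network Communication Co., Ltd., Chongqing 401121, China
  • Revised: 2019-04-30 Online: 2019-07-20 Published: 2019-07-22
  • About the authors: Rongfang ZHANG (1991- ), female, is an engineer at the Big Data Research Center, Research Institute of China United Network Communication Co., Ltd. Her main research interests include data mining, modeling, and data visualization. | Dandan XU (1989- ), female, is an engineer at the Big Data Research Center, Research Institute of China United Network Communication Co., Ltd. Her main research interests are data mining, modeling, and data visualization. | Yuanguang WANG (1989- ), male, is an engineer at the E-Commerce Center of China United Network Communication Group Co., Ltd. His main research interests include data mining and data modeling. | Siyu PAN (1994- ), male, is an engineer at the Big Data Research Center, Research Institute of China United Network Communication Co., Ltd. His main research interests include data mining, algorithm research, and mathematical modeling. | Zhengmao LI (1981- ), male, is a supervisor in the Marketing Department of the Chongqing Branch of China United Network Communication Co., Ltd. His main research interests include product strategy for the communications market and big data analysis and processing.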

Abstract:

With the development of communication technology, IoT cards and 5G will be deployed on a large scale. However, some companies exploit the low tariffs of IoT SIM cards and the absence of real-name registration to profit illegally and undermine social stability, which harms the healthy development of the industry. Identifying fake users has therefore become an important research topic in the IoT industry. This work studies how machine learning models can efficiently identify suspected fake users in massive, real-time IoT terminal data. Specifically, based on the characteristics of the relevant data, a semi-supervised learning model built on positive and unlabeled samples is used to establish a real-time abnormal-behavior monitoring model that identifies potential fake users in the IoT industry. Besides saving substantial manpower and material resources, the model can help relevant departments and personnel discover abnormal user behavior in time and take corresponding measures to avoid large losses, so it has broad application prospects in the industry.

Key words: IoT, positive and unlabeled semi-supervised learning model, Naïve Bayesian classifier, random forest, support vector machine, SPY classifier
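
The abstract names learning from positive and unlabeled samples, and the key words list a SPY classifier and a Naïve Bayesian classifier, but the procedure itself is not described on this page. As a rough illustration only, the following sketch shows the spy step as it is commonly described for positive-unlabeled learning: a few known positives ("spies") are hidden among the unlabeled samples, a naive Bayes model is trained with the unlabeled set treated as negative, and unlabeled samples that score below every spy are kept as reliable negatives. The function names, the two synthetic features, the spy ratio, and the minimum-spy threshold are all assumptions made for this example and are not taken from the paper.

```python
import math
import random


def fit_gaussian_nb(rows, labels):
    """Fit class priors and per-feature mean/variance (Gaussian naive Bayes)."""
    model = {}
    for c in set(labels):
        members = [r for r, y in zip(rows, labels) if y == c]
        stats = []
        for col in zip(*members):
            mean = sum(col) / len(col)
            # Small floor keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-6
            stats.append((mean, var))
        model[c] = (len(members) / len(rows), stats)
    return model


def positive_posterior(model, row):
    """Return P(label == 1 | row) under the fitted model."""
    log_joint = {}
    for c, (prior, stats) in model.items():
        ll = math.log(prior)
        for v, (mean, var) in zip(row, stats):
            ll += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
        log_joint[c] = ll
    top = max(log_joint.values())  # subtract the max for numerical stability
    weights = {c: math.exp(v - top) for c, v in log_joint.items()}
    return weights.get(1, 0.0) / sum(weights.values())


def spy_reliable_negatives(positives, unlabeled, spy_ratio=0.15, seed=0):
    """Hypothetical helper: pick reliable negatives with the spy technique."""
    rng = random.Random(seed)
    n_spies = max(1, int(len(positives) * spy_ratio))
    spy_idx = set(rng.sample(range(len(positives)), n_spies))
    spies = [p for i, p in enumerate(positives) if i in spy_idx]
    kept = [p for i, p in enumerate(positives) if i not in spy_idx]
    # Spies are mixed into the unlabeled set, which is treated as negative.
    rows = kept + unlabeled + spies
    labels = [1] * len(kept) + [0] * (len(unlabeled) + len(spies))
    model = fit_gaussian_nb(rows, labels)
    # Simplification: threshold is the lowest posterior among the spies.
    threshold = min(positive_posterior(model, s) for s in spies)
    return [u for u in unlabeled if positive_posterior(model, u) < threshold]


# Synthetic, deterministic toy data (not from the paper).
positives = [(5 + 0.1 * (i % 5), 5 + 0.1 * (i // 5)) for i in range(20)]
normals = [(0.1 * (i % 5), 0.1 * (i // 5)) for i in range(20)]
hidden_positives = [(5.2, 5.15)] * 4  # fake users hiding among the unlabeled
unlabeled = normals + hidden_positives

reliable = spy_reliable_negatives(positives, unlabeled)
print(len(reliable), "of", len(unlabeled), "unlabeled samples kept as reliable negatives")
```

In a full pipeline, the reliable negatives and the known positives would then be used to train a second-stage classifier, for which the key words suggest random forest or support vector machine as candidates. Taking the strict minimum over spies is a simplification, and a percentile-based threshold is often preferred when the positive labels are noisy.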

