Telecommunications Science, 2017, Vol. 33, Issue 9: 85-91. doi: 10.11959/j.issn.1000-0801.2017208

• Research and Development •

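SVM classifier for telecom user arrears based on boundary samples-based under-sampling approaches

Chuangchuang LI, Guangyue LU, Hanglong WANG

  1. National Engineering Laboratory for Wireless Security, Xi’an University of Posts and Telecommunications, Xi’an 710121, China
  • Revised: 2017-07-06 Online: 2017-09-01 Published: 2017-09-11
  • About the authors: Chuangchuang LI (1991-), male, is a master's student at the National Engineering Laboratory for Wireless Security, Xi’an University of Posts and Telecommunications; his main research interest is data mining. | Guangyue LU (1971-), male, is a professor at the National Engineering Laboratory for Wireless Security, Xi’an University of Posts and Telecommunications; his main research interests are signal and information processing, cognitive radio, and big data analysis. | Hanglong WANG (1989-), male, is a master's student at the National Engineering Laboratory for Wireless Security, Xi’an University of Posts and Telecommunications; his main research interest is data mining.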

Abstract:

Telecom user arrears prediction is a classification problem on an imbalanced data set. To address the low detection accuracy of the traditional support vector machine (SVM) on the minority class of imbalanced data sets, a novel method was proposed. Because the position of the classification plane is determined by the boundary samples, the proposed method removes some of the majority-class samples close to the classification plane to remedy this deficiency of the traditional SVM algorithm. The proposed method was compared with several other algorithms on telecom data and on multiple imbalanced UCI data sets. The experimental results show that the proposed method improves both the detection accuracy of the minority class and the overall evaluation metrics.

Key words: arrears, imbalance, support vector machine (SVM), boundary, under-sampling
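
The under-sampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how boundary samples are selected or how many are removed, so the sketch assumes that a first-pass classifier (for example an initial SVM) has already produced a signed decision value for each sample, treats a small absolute decision value as "close to the classification plane", and uses a hypothetical `remove_fraction` parameter. The function name `undersample_boundary_majority` is likewise illustrative.

```python
from typing import List, Sequence


def undersample_boundary_majority(
    decision_values: Sequence[float],
    labels: Sequence[int],
    majority_label: int,
    remove_fraction: float,
) -> List[int]:
    """Return indices of samples to keep after dropping boundary majority samples.

    decision_values[i] is assumed to be the signed output f(x_i) of an
    already-trained classifier; a small |f(x_i)| means the sample lies
    close to the classification plane. remove_fraction is a hypothetical
    tuning parameter, not a value taken from the paper.
    """
    if len(decision_values) != len(labels):
        raise ValueError("decision_values and labels must have the same length")
    if not 0.0 <= remove_fraction <= 1.0:
        raise ValueError("remove_fraction must lie in [0, 1]")

    # Indices of majority-class samples, nearest to the boundary first.
    majority = [i for i, y in enumerate(labels) if y == majority_label]
    majority.sort(key=lambda i: abs(decision_values[i]))

    # Drop the requested share of majority samples nearest the boundary.
    n_remove = int(len(majority) * remove_fraction)
    removed = set(majority[:n_remove])

    # Minority samples are never removed.
    return [i for i in range(len(labels)) if i not in removed]


if __name__ == "__main__":
    # Toy scores: label 0 is the majority class, label 1 the minority class.
    scores = [-0.1, -0.3, -1.2, -2.0, -0.05, -1.5, 0.4, 0.9]
    labels = [0, 0, 0, 0, 0, 0, 1, 1]
    keep = undersample_boundary_majority(
        scores, labels, majority_label=0, remove_fraction=0.5
    )
    print(keep)  # prints [2, 3, 5, 6, 7]
```

Under these assumptions, the returned indices would select the reduced training set on which the SVM is retrained, so that fewer majority-class samples sit near the classification plane and pull it toward the minority class.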
