Telecommunications Science, 2023, Vol. 39, Issue 3: 135-142. doi: 10.11959/j.issn.1000-0801.2023038

• Research and Development •

基于GAN数据重构的电信用户流失预测方法

阿克弘, 胡晓东   

  1. 中国电信股份有限公司西宁分公司,青海 西宁 810001
  • 修回日期:2023-03-01 出版日期:2023-03-20 发布日期:2023-03-01
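  • About the authors: Kehong A (1991- ), male, engineer and director of the Product Department at the Xining Branch of China Telecom Co., Ltd. His main research interests are data analysis and data mining based on telecom subscriber data.
    Xiaodong HU (1995- ), male, assistant engineer at the Xining Branch of China Telecom Co., Ltd. His main research interests are data mining and fault early warning based on wind turbine operating data, and data analysis and data mining based on telecom subscriber data.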

GAN data reconstruction based prediction method of telecom subscriber loss

Kehong A, Xiaodong HU   

  1. Xining Branch of China Telecom Co., Ltd., Xining 810001, China
  • Revised:2023-03-01 Online:2023-03-20 Published:2023-03-01

摘要:

用户是运营商利益的核心。随着携号转网政策的出台,运营商之间的竞争越发激烈。为了提前精准有效地预测用户流失倾向,提出了一种基于生成对抗网络(generative adversarial network,GAN)数据重构的电信用户流失预测方法。首先,利用有效的数据预处理方法清洗电信用户流失数据中的脏数据;其次,利用GAN重构电信用户流失数据,解决电信用户流失数据不平衡问题;最后,利用极度梯度提升树(extreme gradient boosting,XGBoost)算法分别训练基于 GAN 重构的电信用户流失预测模型和基于合成少数类过采样技术(synthetic minority oversampling technique,SMOTE)采样的电信用户流失预测模型,对比两种模型的预测精度。实验结果表明,GAN 重构后的电信用户流失预测模型预测精度比未重构的预测模型的准确率提升了6.75%,查准率提升了25.91%,召回率提升了30.91%,F1值提升了28.73%。该方法能够有效提升电信用户流失预测的准确度。

关键词: XGBoost算法, 生成对抗网络, 用户流失, 数据重构, SMOTE

Abstract:

Subscribers are at the core of operators' interests. With the introduction of the mobile number portability policy, competition among operators has become increasingly fierce. To predict subscriber loss (churn) tendency accurately and effectively in advance, a telecom subscriber loss prediction method based on generative adversarial network (GAN) data reconstruction is proposed. Firstly, effective data preprocessing methods are used to clean the dirty data in the telecom subscriber loss data. Secondly, the GAN is used to reconstruct the telecom subscriber loss data, which addresses the class imbalance in that data. Finally, the extreme gradient boosting (XGBoost) algorithm is used to train two telecom subscriber loss prediction models, one on the GAN-reconstructed data and one on data sampled with the synthetic minority oversampling technique (SMOTE), and the prediction performance of the two models is compared. The experimental results show that, compared with the prediction model trained on unreconstructed data, the model trained on GAN-reconstructed data improves accuracy by 6.75%, precision by 25.91%, recall by 30.91%, and F1-score by 28.73%. The method can effectively improve the accuracy of telecom subscriber loss prediction.

Key words: XGBoost algorithm, generative adversarial network, customer churn, data reconstruction, synthetic minority oversampling technique
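
The abstract reports four separate metrics (accuracy, precision, recall and F1-score), and the distinction matters on imbalanced churn data. The following is a minimal sketch of the standard definitions, not the authors' evaluation code. The function name `classification_metrics` and the toy labels are hypothetical, chosen only for illustration; label 1 is assumed to mean "churned".

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary label.

    Hypothetical helper for illustration; `positive` marks the churn class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of the subscribers flagged as churners, how many churned.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the subscribers who churned, how many were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy data (made up): 10 subscribers, only 2 of whom churn.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# A model that never predicts churn still scores 80% accuracy.
always_stay = classification_metrics(y_true, [0] * 10)

# A model that finds one churner, with one false alarm, has the same accuracy.
finds_one = classification_metrics(y_true, [0, 0, 0, 0, 0, 0, 0, 1, 1, 0])

print(always_stay)
print(finds_one)
```

In this toy case both predictors reach the same accuracy, while precision, recall and F1 separate them. This is one reason a gain in accuracy can be much smaller than the gains in the other three metrics when the churn class is rare.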
