Chinese Journal on Internet of Things (物联网学报) ›› 2018, Vol. 2 ›› Issue (2): 65-72. doi: 10.11959/j.issn.2096-3750.2018.00055

• Theory and Technology •

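Multilayer neural network model for unbalanced data

Xue ZHANG (张雪), Zhiguo SHI (石志国), Xuan LIU (刘璇)

  1. School of Computer & Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
  • Revised: 2018-05-15 Online: 2018-06-01 Published: 2018-07-03
  • About the authors: Xue ZHANG (born 1995), female, is a master's student at University of Science and Technology Beijing; her main research interests are medical data analysis and algorithm design and analysis. | Zhiguo SHI (born 1978), male, Ph.D., is a professor at University of Science and Technology Beijing; his main research interests are intelligent systems and Internet of Things technology. | Xuan LIU (born 1993), female, is a master's student at University of Science and Technology Beijing; her main research interests are medical data analysis and algorithm design and analysis.
  • Supported by:
    The National Key R&D Program of China (2016YFC0901303)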

Abstract:

In conventional approaches to classifying unbalanced data, the imbalance between classes often degrades classifier performance. Using AUC (the area under the ROC curve) as the evaluation index, one-class F-score feature selection is combined with a genetic algorithm to build a multilayer neural network model and to select a feature subset that is more favorable for classifying unbalanced data, yielding a deep model better suited to this task. The multilayer neural network model was built on TensorFlow, tested on four different UCI datasets, and compared with traditional machine learning algorithms such as naive Bayes, K-nearest neighbors (KNN), and neural networks. The experiments show that the proposed model performs better on unbalanced data classification.

Key words: unbalanced data, one-class F-score feature selection, genetic algorithm, multilayer neural network
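The abstract uses AUC rather than accuracy as the evaluation index. The following is a minimal sketch, in plain Python, of why that choice matters for unbalanced data. The labels and scores are toy values made up for illustration and do not come from the paper's UCI experiments, and the function names `auc_score` and `accuracy` are hypothetical helpers, not part of the authors' implementation.

```python
def auc_score(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen positive
    sample receives a higher score than a randomly chosen negative one
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples classified correctly at a fixed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Toy unbalanced set (assumed data): 8 negatives followed by 2 positives.
labels = [0] * 8 + [1] * 2

# A degenerate classifier that gives every sample the same low score,
# so it always predicts the majority class.
constant_scores = [0.1] * 10

# A classifier that ranks both positives above most negatives.
ranked_scores = [0.1, 0.2, 0.15, 0.3, 0.25, 0.05, 0.35, 0.6, 0.7, 0.4]

print(accuracy(labels, constant_scores), auc_score(labels, constant_scores))
print(accuracy(labels, ranked_scores), auc_score(labels, ranked_scores))
```

On this toy data both classifiers reach the same accuracy of 0.8, yet the constant classifier has an AUC of 0.5 (no better than chance) while the ranking classifier has an AUC of 0.9375. Accuracy hides the difference because the majority class dominates the count, whereas AUC depends only on how positives are ranked against negatives.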
