Telecommunications Science ›› 2021, Vol. 37 ›› Issue (5): 133-147. doi: 10.11959/j.issn.1000-0801.2021025

• Research and Development •

Feature selection method for software defect number prediction based on maximum information coefficient

Guoqing LIU, Xingqi WANG, Dan WEI, Jinglong FANG, Yanli SHAO   

  1. School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
  • Revised: 2021-01-18  Online: 2021-05-20  Published: 2021-05-01
  • Supported by:
    The Natural Science Foundation of Zhejiang Province of China (LY20F020015); The Natural Science Foundation of Zhejiang Province of China (LY21F020015); The National Natural Science Foundation of China (61702517); The National Natural Science Foundation of China (61972121); The National Natural Science Foundation of China (61702146); Defense Industrial Technology Development Program (JCKY2019415C001)

Abstract:

Traditional feature selection methods consider only the linear correlation between variables and ignore nonlinear correlation, which makes it difficult to select feature subsets that yield an effective model for predicting the number of faults in software modules. To account for both linear and nonlinear relationships, a feature selection method based on the maximum information coefficient (MIC) was proposed. The proposed method separates redundancy analysis and correlation analysis into two phases. In the first phase, a clustering algorithm based on the correlation between features groups redundant features into the same cluster. In the second phase, the features in each cluster are sorted in descending order of their correlation with the number of software defects, and the top-ranked features are selected to form the feature subset. The experimental results show that the proposed method can improve the performance of software defect number prediction models by effectively removing redundant and irrelevant features.
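
The two-phase idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the clustering algorithm, the thresholds, or how many features are kept per cluster, so the greedy threshold clustering, `redundancy_threshold`, and `top_per_cluster` below are all assumptions. The function `mic_approx` is also a hypothetical stand-in: it is a crude equal-frequency grid search for normalized mutual information, not the standard MIC estimator.

```python
import math
from collections import Counter


def _equal_freq_bins(values, n_bins):
    """Map values to rank-based bins of roughly equal size; ties share a bin."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    prev_val, prev_bin = None, 0
    for rank, idx in enumerate(order):
        b = rank * n_bins // len(values)
        if values[idx] == prev_val:
            b = prev_bin  # keep tied values together
        bins[idx] = b
        prev_val, prev_bin = values[idx], b
    return bins


def _mutual_information(a, b):
    """Mutual information (in nats) between two discrete label sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log(c * n / (pa[x] * pb[y]))
               for (x, y), c in pab.items())


def mic_approx(x, y):
    """Crude MIC-style score in [0, 1] (assumption: simplified grid search).

    Tries equal-frequency grids with nx * ny <= n ** 0.6 cells and returns
    the largest mutual information normalized by log(min(nx, ny)).
    """
    budget = max(4, int(len(x) ** 0.6))
    best = 0.0
    for nx in range(2, budget // 2 + 1):
        bx = _equal_freq_bins(x, nx)
        for ny in range(2, budget // nx + 1):
            by = _equal_freq_bins(y, ny)
            score = _mutual_information(bx, by) / math.log(min(nx, ny))
            best = max(best, score)
    return min(best, 1.0)


def select_features(features, defects, redundancy_threshold=0.8,
                    top_per_cluster=1):
    """features: dict of name -> list of values; defects: defect counts."""
    # Phase 1 (redundancy): a feature joins the first cluster whose first
    # member it depends on strongly; otherwise it starts a new cluster.
    clusters = []
    for name, col in features.items():
        for cluster in clusters:
            if mic_approx(col, features[cluster[0]]) >= redundancy_threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    # Phase 2 (relevance): rank each cluster by dependence on the defect
    # count and keep the top-ranked features.
    selected = []
    for cluster in clusters:
        ranked = sorted(cluster,
                        key=lambda f: mic_approx(features[f], defects),
                        reverse=True)
        selected.extend(ranked[:top_per_cluster])
    return clusters, selected
```

Calling `select_features` with a dictionary of metric columns and a list of per-module defect counts returns the clusters and the retained feature names. In practice one would likely replace `mic_approx` with an established MIC estimator and tune both parameters on validation data.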

Key words: software defect number prediction, feature selection, maximum information coefficient
