Journal on Communications ›› 2018, Vol. 39 ›› Issue (5): 111-122. doi: 10.11959/j.issn.1000-436x.2018082
Li ZHANG1,2, Cong WANG1,2
Revised: 2018-04-18
Online: 2018-05-01
Published: 2018-06-01
About the authors:
Li ZHANG (1977-), male, born in Hanzhong, Shaanxi, is a Ph.D. candidate at Beijing University of Posts and Telecommunications. His research interests include machine learning, feature engineering, and the analysis and mining of healthcare data. | Cong WANG (1958-), female, born in Beijing, Ph.D., is a professor and doctoral supervisor at Beijing University of Posts and Telecommunications. Her research interests include intelligent information processing, network information security, trusted computing and services, and the analysis and mining of healthcare data.
Supported by:
Abstract:
Over the past few decades, feature selection has played an important role in machine learning and artificial intelligence. Many feature selection algorithms nevertheless select redundant and irrelevant features, because they overstate the importance of certain features. At the same time, too many features slow down learning and cause classifiers to overfit. A new forward-search nonlinear feature selection algorithm is therefore proposed. Drawing on the theory of mutual information and interaction information, it searches for the optimal subset relevant to the multi-class labels while reducing computational complexity. Comparative experiments on nine UCI datasets with four different classifiers show that the proposed algorithm outperforms both the original full feature set and the feature subsets selected by other feature selection algorithms.
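As an illustration of the kind of forward search the abstract describes, the sketch below greedily adds features scored by mutual information and interaction information. It uses a JMI-style criterion (relevance minus pairwise redundancy plus class-conditional redundancy) as an assumed stand-in, not the paper's exact JMMC scoring function, and the helper names (`mutual_info`, `cond_mutual_info`, `jmi_forward_select`) are hypothetical.

```python
import numpy as np
from collections import Counter

def mutual_info(x, y):
    """I(X;Y) in nats for discrete-valued sequences, from empirical counts."""
    n = len(x)
    pxy = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    mi = 0.0
    for (a, b), c in pxy.items():
        # p(a,b) * log( p(a,b) / (p(a) p(b)) ), with counts substituted in
        mi += (c / n) * np.log(c * n / (px[a] * py[b]))
    return max(mi, 0.0)  # guard against tiny negative rounding error

def cond_mutual_info(x, y, z):
    """I(X;Y|Z) = sum_z p(z) * I(X;Y | Z=z) for discrete-valued sequences."""
    x, y, z = map(np.asarray, (x, y, z))
    total = 0.0
    for v in np.unique(z):
        mask = z == v
        total += mask.mean() * mutual_info(x[mask], y[mask])
    return total

def jmi_forward_select(X, y, k):
    """Greedy forward search: start from the single most relevant feature,
    then repeatedly add the candidate f maximizing
        sum over selected s of [ I(f;y) - I(f;s) + I(f;s|y) ],
    i.e. relevance corrected for redundancy and interaction with y."""
    X, y = np.asarray(X), np.asarray(y)
    relevance = [mutual_info(X[:, j], y) for j in range(X.shape[1])]
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best_j, best_score = None, -np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            score = sum(relevance[j]
                        - mutual_info(X[:, j], X[:, s])
                        + cond_mutual_info(X[:, j], X[:, s], y)
                        for s in selected)
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected
```

Each round evaluates one mutual-information and one conditional-mutual-information term per (candidate, selected) pair, so the search costs O(k·d) information estimates for d features — which is why forward searches of this kind keep the target subset size k small.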
CLC number:
Li ZHANG, Cong WANG. Multi-label feature selection algorithm based on joint mutual information of max-relevance and min-redundancy[J]. Journal on Communications, 2018, 39(5): 111-122.
Table 2  Average accuracy of the selected feature subsets with the 3KNN classifier

| No. | FullSet acc. | JMMC #features | JMMC acc. | IG #features | IG acc. | FCBF #features | FCBF acc. | ReliefF #features | ReliefF acc. |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 67.037% | 8 | 71.111% | 8 | 66.296% | 8 | 66.296% | 6 | 66.296% |
| 2 | 96.013% | 31 | 96.846% | 28 | 91.027% | 34 | 90.455% | 24 | 97.171% |
| 3 | 80% | 87 | 80.666% | 85 | 79.222% | 85 | 79.222% | 61 | 79.555% |
| 4 | 92.633% | 16 | 94.024% | 26 | 92.633% | 26 | 92.633% | 7 | 92.633% |
| Avg. | 83.921% | 35.5 | 85.662% | 36.75 | 82.294% | 38.25 | 82.151% | 24.5 | 83.914% |
Table 3  Average accuracy of the selected feature subsets with the C4.5 classifier

| No. | FullSet acc. | JMMC #features | JMMC acc. | IG #features | IG acc. | FCBF #features | FCBF acc. | ReliefF #features | ReliefF acc. |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 76.296% | 8 | 78.148% | 7 | 76.296% | 12 | 74.814% | 11 | 75.925% |
| 2 | 94.735% | 26 | 96.124% | 25 | 94.146% | 32 | 93.297% | 29 | 95.029% |
| 3 | 68.444% | 74 | 67.777% | 53 | 66.666% | 67 | 67.333% | 27 | 68.444% |
| 4 | 95.099% | 6 | 95.437% | 16 | 94.384% | 16 | 94.739% | 23 | 94.212% |
| Avg. | 83.643% | 28.5 | 84.371% | 25.25 | 82.873% | 31.75 | 82.545% | 22.5 | 83.402% |
Table 4  Average accuracy of the selected feature subsets with the SVM classifier

| No. | FullSet acc. | JMMC #features | JMMC acc. | IG #features | IG acc. | FCBF #features | FCBF acc. | ReliefF #features | ReliefF acc. |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 70.37% | 8 | 71.851% | 5 | 59.259% | 1 | 57.407% | 2 | 55.925% |
| 2 | 97.434% | 25 | 97.743% | 24 | 93.169% | 34 | 92.645% | 31 | 97.728% |
| 3 | 52.666% | 61 | 53.555% | 89 | 51.111% | 89 | 51.111% | 84 | 53.555% |
| 4 | 88.973% | 10 | 94.569% | 5 | 89.111% | 5 | 89.111% | 1 | 69.197% |
| Avg. | 77.361% | 26 | 79.429% | 30.75 | 73.162% | 32.25 | 72.568% | 29.5 | 69.101% |
Table 5  Average accuracy of the different algorithms with mixed classifiers

| No. | FullSet acc. | JMMC #features | JMMC acc. | IG #features | IG acc. | FCBF #features | FCBF acc. | ReliefF #features | ReliefF acc. |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 68.765% | 4 | 79.51% | 6 | 66.173% | 5 | 71.111% | 6 | 66.42% |
| 2 | 95.722% | 21 | 96.952% | 34 | 92.416% | 22 | 96.692% | 23 | 93.522% |
| 3 | 65.888% | 81 | 66.777% | 85 | 65.148% | 87 | 64.962% | 82 | 66.037% |
| 4 | 87.301% | 7 | 93.926% | 5 | 91.107% | 8 | 90.288% | 5 | 91.284% |
| Avg. | 79.419% | 28.25 | 84.291% | 32.5 | 78.711% | 30.5 | 80.763% | 29 | 79.316% |
Table 7  Average accuracy of the selected feature subsets with the 3KNN classifier

| No. | FullSet acc. | JMMC #features | JMMC acc. | IG #features | IG acc. | FCBF #features | FCBF acc. | ReliefF #features | ReliefF acc. |
|---|---|---|---|---|---|---|---|---|---|
| 5 | 66.345% | 133 | 67.278% | 155 | 67.278% | 183 | 65.601% | 40 | 65.06% |
| 6 | 77.967% | 128 | 79.516% | 153 | 79.516% | 118 | 78.203% | 83 | 78.407% |
| 7 | 97.7% | 63 | 97.7% | 64 | 97.7% | 64 | 97.7% | 64 | 97.7% |
| 8 | 100% | 6 | 100% | 11 | 100% | 17 | 100% | 19 | 98.425% |
| 9 | 92.676% | 23 | 96.214% | 36 | 96.214% | 16 | 92.614% | 36 | 95.776% |
| Avg. | 86.937% | 70.6 | 88.155% | 83.8 | 88.141% | 79.6 | 86.823% | 48.4 | 87.073% |
Table 8  Average accuracy of the selected feature subsets with the C4.5 classifier

| No. | FullSet acc. | JMMC #features | JMMC acc. | IG #features | IG acc. | FCBF #features | FCBF acc. | ReliefF #features | ReliefF acc. |
|---|---|---|---|---|---|---|---|---|---|
| 5 | 65.846% | 168 | 71.302% | 228 | 67.177% | 172 | 67.442% | 242 | 67.229% |
| 6 | 76.926% | 152 | 77.121% | 158 | 75.694% | 162 | 75.06% | 163 | 76.31% |
| 7 | 84.849% | 17 | 84.049% | 14 | 84.299% | 11 | 85.049% | 10 | 84.8% |
| 8 | 100% | 6 | 100% | 11 | 100% | 17 | 98.425% | 19 | 99.852% |
| 9 | 100% | 23 | 100% | 36 | 100% | 16 | 100% | 36 | 100% |
| Avg. | 85.524% | 73.2 | 86.494% | 89.4 | 85.434% | 75.6 | 85.195% | 94 | 85.638% |
Table 9  Average accuracy of the selected feature subsets with the SVM classifier

| No. | FullSet acc. | JMMC #features | JMMC acc. | IG #features | IG acc. | FCBF #features | FCBF acc. | ReliefF #features | ReliefF acc. |
|---|---|---|---|---|---|---|---|---|---|
| 5 | 56.946% | 68 | 59.012% | 3 | 56.946% | 3 | 56.946% | 5 | 56.946% |
| 6 | 60.728% | 6 | 63.517% | 2 | 58.644% | 2 | 58.644% | 2 | 59.247% |
| 7 | 94.65% | 64 | 94.65% | 64 | 94.65% | 64 | 94.65% | 64 | 94.65% |
| 8 | 97.047% | 1 | 98.523% | 1 | 98.523% | 1 | 98.523% | 18 | 97.023% |
| 9 | 100% | 23 | 100% | 36 | 100% | 16 | 100% | 36 | 100% |
| Avg. | 81.874% | 32.4 | 83.14% | 21.2 | 81.752% | 17.2 | 81.752% | 25 | 81.573% |
Table 10  Average accuracy of the different algorithms with different classifiers

| No. | FullSet acc. | JMMC #features | JMMC acc. | IG #features | IG acc. | FCBF #features | FCBF acc. | ReliefF #features | ReliefF acc. |
|---|---|---|---|---|---|---|---|---|---|
| 5 | 62.853% | 171 | 63.935% | 189 | 62.983% | 237 | 62.904% | 255 | 63.364% |
| 6 | 70.198% | 158 | 70.484% | 166 | 69.913% | 162 | 69.991% | 165 | 70.129% |
| 7 | 91.416% | 63 | 91.549% | 61 | 91.333% | 64 | 91.466% | 63 | 91.383% |
| 8 | 98.15% | 6 | 98.757% | 11 | 98.014% | 20 | 96.806% | 19 | 97.839% |
| 9 | 97.558% | 23 | 98.738% | 36 | 97.538% | 36 | 97.485% | 36 | 97.444% |
| Avg. | 84.035% | 84.2 | 84.693% | 92.6 | 83.957% | 103.8 | 83.73% | 107.6 | 84.032% |