Big Data Research (大数据), 2022, Vol. 8, Issue (4): 133-144. doi: 10.11959/j.issn.2096-0271.2022066

• Research •

Neighborhood conditional mutual information entropy attribute reduction algorithm for hybrid data

Haibo LAN (兰海波)

  1. CMA Public Meteorological Service Centre, Beijing 100081, China
  • Online: 2022-07-15 Published: 2022-07-01
  • About the author: Haibo LAN (1979- ), male, is a senior engineer at the CMA Public Meteorological Service Centre. His main research interests include big data processing, natural language processing, database technology, and the key technologies and applications of meteorological service information systems.

Abstract:

Attribute reduction is an important research topic in rough set theory. Its main purpose is to eliminate irrelevant attributes from information systems, reduce data dimensionality, and improve the performance of knowledge discovery. However, most rough-set-based attribute reduction methods do not consider the dependence between attributes, so the final reduction result still contains some redundant attributes. To address this, an attribute reduction algorithm based on neighborhood conditional mutual information entropy is proposed. First, building on traditional neighborhood entropy, a hybrid neighborhood mutual information entropy model and a hybrid neighborhood conditional mutual information entropy model are proposed for hybrid data. Then, the two entropy models are used for attribute dependence evaluation and heuristic attribute search in hybrid information systems, and an attribute reduction algorithm is designed. Finally, experimental analysis on UCI data sets shows that the proposed algorithm achieves high attribute reduction performance.

Key words: rough set, attribute reduction, neighborhood, mutual information entropy, conditional mutual information entropy
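
The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea, written under stated assumptions rather than as the author's models. It assumes the commonly used neighborhood entropy NH(A) = -(1/n) Σ log2(|δ_A(x_i)| / n), a hybrid neighborhood in which numeric attributes must differ by at most `delta` and categorical attributes must match exactly, and a plain greedy forward search that stops when the conditional gain vanishes. All function names and parameters (`neighborhood_sizes`, `greedy_reduct`, `delta`, `eps`) are hypothetical.

```python
import math


def neighborhood_sizes(data, attrs, numeric, delta):
    """Size of each sample's neighborhood over `attrs`.

    Assumption: two samples are neighbors if every numeric attribute
    differs by at most `delta` and every categorical attribute is equal.
    """
    def close(x, y):
        for a in attrs:
            if a in numeric:
                if abs(x[a] - y[a]) > delta:
                    return False
            elif x[a] != y[a]:
                return False
        return True

    n = len(data)
    return [sum(1 for j in range(n) if close(data[i], data[j])) for i in range(n)]


def neighborhood_entropy(data, attrs, numeric, delta):
    """NH(A) = -(1/n) * sum_i log2(|neighborhood_A(x_i)| / n)."""
    n = len(data)
    sizes = neighborhood_sizes(data, attrs, numeric, delta)
    return -sum(math.log2(s / n) for s in sizes) / n


def conditional_mutual_information(data, a, d, given, numeric, delta):
    """I(a; d | given) = NH(a,g) + NH(d,g) - NH(a,d,g) - NH(g).

    With `given` empty this reduces to the mutual information I(a; d),
    because the neighborhood over no attributes is the whole sample set.
    """
    a, d, given = list(a), list(d), list(given)

    def h(attrs):
        return neighborhood_entropy(data, attrs, numeric, delta)

    return h(a + given) + h(d + given) - h(a + d + given) - h(given)


def greedy_reduct(data, candidates, decision, numeric, delta, eps=1e-9):
    """Greedy forward selection (an assumed heuristic, not the paper's):
    repeatedly add the attribute with the largest conditional gain."""
    selected, remaining = [], list(candidates)
    while remaining:
        gains = {
            a: conditional_mutual_information(
                data, [a], [decision], selected, numeric, delta
            )
            for a in remaining
        }
        best = max(remaining, key=gains.get)
        if gains[best] <= eps:
            break
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    # Toy hybrid table: column 0 numeric, column 1 categorical, column 2 decision.
    toy = [(0.10, "a", 0), (0.15, "b", 0), (0.80, "a", 1), (0.85, "b", 1)]
    print(greedy_reduct(toy, [0, 1], 2, numeric={0}, delta=0.1))
```

In the toy table, the numeric column separates the two decision classes while the categorical column carries no information about them, so the search keeps only column 0. The pairwise neighborhood computation is quadratic in the number of samples, which is acceptable for a sketch but would need indexing or vectorization on larger data.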
