
Feature selection using adaptive neighborhood mutual information and spectral clustering
Citation: SUN Lin, LIANG Na, XU Jiu-cheng. Feature selection using adaptive neighborhood mutual information and spectral clustering[J]. Journal of Shandong University (Natural Science), 2022, 57(12): 13-24.
Authors: SUN Lin, LIANG Na, XU Jiu-cheng
Institution: 1. College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, Henan, China; 2. Henan Engineering Laboratory of Smart Business and Internet of Things Technology, Xinxiang 453007, Henan, China
Funding: National Natural Science Foundation of China (62076089, 61976082); Science and Technology Research Project of Henan Province (212102210136)
Abstract: To remove the need to set parameters manually in traditional spectral clustering, this paper proposes a feature selection algorithm based on adaptive neighborhood mutual information and spectral clustering, drawing on the strength of neighborhood rough sets in handling continuous data. First, the standard deviation set and the adaptive neighborhood set of each object under an attribute are defined, and several uncertainty measures are given: adaptive neighborhood entropy, average neighborhood entropy, joint entropy, neighborhood conditional entropy, and neighborhood mutual information. Adaptive neighborhood mutual information is then used to rank features by their relevance to the labels. Second, a shared-nearest-neighbor adaptive spectral clustering algorithm groups strongly correlated features into the same feature cluster, so that features in different clusters are strongly dissimilar. Finally, the feature selection algorithm is designed using the minimum redundancy maximum relevance (mRMR) technique. Experimental results on ten datasets, reported as the number of selected features and the classification accuracy, verify the effectiveness of the proposed algorithm.

Keywords: feature selection; neighborhood rough set; adaptive neighborhood mutual information; spectral clustering; minimum redundancy maximum relevance
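
The ranking and selection steps in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the definition of the adaptive neighborhood, so the radius `std / lam` used here is an assumed stand-in, the spectral clustering stage is omitted entirely, and the names `neighbors`, `nmi`, `select_features`, and `lam` are hypothetical. The sketch uses a commonly used form of neighborhood mutual information together with the difference form of mRMR (relevance minus mean redundancy).

```python
import numpy as np

def neighbors(values, lam=2.0):
    """Boolean relation: near[i, j] is True when sample j is a neighbor of i.

    Assumption: continuous values use the radius std / lam, a stand-in for
    the paper's adaptive neighborhood. lam=None means exact equality,
    which suits class labels.
    """
    v = np.asarray(values)
    if lam is None:
        return v[:, None] == v[None, :]
    v = v.astype(float)
    return np.abs(v[:, None] - v[None, :]) <= v.std() / lam

def nmi(near_a, near_b):
    """Neighborhood mutual information of two neighborhood relations."""
    n = near_a.shape[0]
    size_a = near_a.sum(axis=1)
    size_b = near_b.sum(axis=1)
    # Never zero: every sample is a neighbor of itself in both relations.
    size_ab = (near_a & near_b).sum(axis=1)
    return float(-np.mean(np.log(size_a * size_b / (n * size_ab))))

def select_features(X, y, k, lam=2.0):
    """Greedy mRMR for k >= 1: relevance to y minus mean redundancy."""
    X = np.asarray(X, dtype=float)
    rel_y = neighbors(y, lam=None)
    rels = [neighbors(X[:, j], lam) for j in range(X.shape[1])]
    relevance = np.array([nmi(r, rel_y) for r in rels])
    selected = [int(np.argmax(relevance))]  # start with the most relevant feature
    while len(selected) < min(k, len(rels)):
        score = np.array([
            relevance[j] - np.mean([nmi(rels[j], rels[s]) for s in selected])
            for j in range(len(rels))
        ])
        score[selected] = -np.inf  # never pick the same feature twice
        selected.append(int(np.argmax(score)))
    return selected

# Toy data: column 1 duplicates column 0, column 2 carries complementary information.
X = np.array([[0, 0, 0], [0, 0, 0], [0, 0, 10], [0, 0, 10],
              [10, 10, 0], [10, 10, 0], [10, 10, 10], [10, 10, 10]])
y = np.array([0, 0, 1, 1, 2, 2, 3, 3])
print(select_features(X, y, k=2))  # [0, 2]: the duplicate column is skipped
```

All three columns are equally relevant to the labels on their own, so relevance ranking alone cannot tell them apart. The redundancy term is what steers the second pick away from the duplicate column, which is the role mRMR plays in the final step described in the abstract.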