
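Multi-label feature selection method based on correlation analysis
Cite as:WANG Jin, SUN Wantong. Multi-label feature selection method based on correlation analysis[J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2021, 33(6): 1024-1037. DOI: 10.3979/j.issn.1673-825X.201912180446
Authors:WANG Jin  SUN Wantong
Affiliation:Key Laboratory of Data Engineering and Visual Computing, Chongqing University of Posts and Telecommunications, Chongqing 400065
Funding:National Natural Science Foundation of China (61806033)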
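Abstract:Most existing multi-label feature selection algorithms fail to remove redundant features from the feature space effectively and also ignore the differences among labels. To address this, a multi-label feature selection method based on correlation analysis is proposed. Features are grouped according to the degree of correlation between them, which addresses the problem of correlation between features. For each label, the samples are divided into positive and negative classes according to their label value and clustered; the number of clusters is determined separately for the positive clusters formed by the positive samples and the negative clusters formed by the negative samples, and the distances from the original features to each cluster center in the positive and negative clusters are computed, which yields the label-specific feature space. The label-shared feature space and the label-specific feature space are then fused, taking into account both the individuality of each label and the correlations among labels, which addresses the label-difference problem. Experiments show that, compared with existing multi-label feature selection algorithms, the proposed method performs better on every classification metric, which demonstrates its effectiveness.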

Keywords:machine learning  multi-label learning  feature selection  association analysis  feature space fusion
Received:2019-12-18
Revised:2021-04-16
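
The feature-grouping step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which correlation measure, threshold, or grouping rule the authors use, so everything below is an assumption made for illustration: Pearson correlation, a hypothetical `threshold` parameter, and a greedy rule that compares each feature with the first member of each existing group. The function names `pearson` and `group_features` are also hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length columns; 0.0 if either is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def group_features(X, threshold=0.9):
    """Greedily group feature indices (assumed rule, not the paper's).

    A feature joins the first group whose first member it correlates with
    at |r| >= threshold; otherwise it starts a new group.
    """
    columns = list(zip(*X))  # one tuple per feature
    groups = []
    for j, col in enumerate(columns):
        for group in groups:
            if abs(pearson(columns[group[0]], col)) >= threshold:
                group.append(j)
                break
        else:
            groups.append([j])
    return groups

# Toy data: feature 1 is exactly twice feature 0, feature 2 is unrelated.
X = [[1, 2, 5], [2, 4, 1], [3, 6, 4], [4, 8, 2]]
groups = group_features(X, threshold=0.9)
```

In this toy case features 0 and 1 fall into one group and feature 2 forms its own. How a group is then reduced (for example, by keeping one representative feature) is not described in the abstract and is left out here.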

Multi-label feature selection method based on correlation analysis
WANG Jin,SUN Wantong. Multi-label feature selection method based on correlation analysis[J]. Journal of Chongqing University of Posts and Telecommunications, 2021, 33(6): 1024-1037. DOI: 10.3979/j.issn.1673-825X.201912180446
Authors:WANG Jin  SUN Wantong
Affiliation:Key Laboratory of Data Engineering and Visual Computing, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
Abstract:Most existing multi-label feature selection algorithms fail to remove redundant features from the feature space effectively and also ignore the differences among labels. To address this, a multi-label feature selection method based on correlation analysis is proposed. Features are grouped according to the correlation between them, which addresses the problem of correlation between features. For each label, the samples are divided into positive and negative samples according to their label value and clustered. The number of clusters is determined separately for the positive and the negative samples, and the distance from the original features to every cluster center in the positive and negative clusters is computed. These distances form the label-specific feature space. Finally, the label-shared feature space and the label-specific feature space are fused, so that both the individuality of each label and the correlations among labels are taken into account, which addresses the label-difference problem. In experiments on 9 multi-label data sets, the proposed method performs better than existing multi-label feature selection algorithms on every classification metric, which demonstrates its effectiveness.
Keywords:machine learning  multi-label learning  feature selection  association analysis  feature space fusion
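
The construction of the label-specific feature space described in the abstract can be sketched roughly as follows, for a single label. This is not the authors' implementation: the abstract does not say which clustering algorithm is used or how the cluster counts are chosen, so the sketch assumes plain k-means with Euclidean distance, seeds the centers with the first `k` points for determinism, and takes the hypothetical parameters `k_pos` and `k_neg` as given.

```python
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; the first k points seed the centers (assumption)."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iters):
        buckets = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda j: math.dist(p, centers[j]))
            buckets[nearest].append(p)
        for j, bucket in enumerate(buckets):
            if bucket:  # keep the old center if a cluster goes empty
                centers[j] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centers

def label_specific_features(X, y, k_pos, k_neg):
    """Map each sample to its distances from the positive and negative
    cluster centers of one label (y holds 1 for positive, 0 for negative)."""
    positives = [x for x, t in zip(X, y) if t == 1]
    negatives = [x for x, t in zip(X, y) if t == 0]
    centers = kmeans(positives, k_pos) + kmeans(negatives, k_neg)
    return [[math.dist(x, c) for c in centers] for x in X]

# Toy data: two positive samples near the origin, two negative ones far away.
X = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
y = [1, 1, 0, 0]
Z = label_specific_features(X, y, k_pos=1, k_neg=1)
```

Each row of `Z` has `k_pos + k_neg` entries, one per cluster center. One simple way to fuse this with a shared feature space would be to concatenate the two vectors per sample, but the abstract does not state how the fusion is performed, so that step is omitted.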
This article is indexed in Wanfang Data and other databases.